Introduction

Ever  since  spectrometric  instruments  have  been  coupled  with  computers,  there  has  been  a 
need  to  interchange  the  data  produced  by  the  instrument  so  that  they  can  be  used  in  other 
applications. 

Each  instrument’s  software  usually  has  its  own  native  data  format,  which  is  often  incompatible 
with  others.  Because  of  this,  a number  of  data  interchange  formats  have  been  developed  over 
the  years  so  that  different  applications  can  share  data,  as  long  as  each  application  supports 
the  import  and  export  functions  of  the  interchange  format. 

This  approach  is  functional,  but  it  has  several  disadvantages: 

• The  current  interchange  formats  are  fixed,  which  means  it  is  not  possible  to  add  new  data 
elements  easily. 

• The  interchange  data  structure  is  fixed,  which  forces  the  data  elements  to  be  maintained  in 
a precise  order. 

• The  various  instrument  software  and  interchange  formats  do  not  all  convey  the  same 
information,  so  data  can  be  lost  in  the  interconversion. 

• Some  of  the  interchange  formats  encompass  a wide  variety  of  instrument  data  resulting  in 
a huge  number  of  data  elements,  many  of  which  are  not  needed  for  a given  application. 

• Many  applications  do  not  support  all  formats. 

• Result  metadata  (descriptive  elements  concerning  the  data)  and  information  about  the 
sample  and  the  measurement  process  are  often  omitted. 

• Current  interchange  mechanisms  are  not  compatible  with  modern  computer  network 
technologies. 

Beyond  these  difficulties,  interchange  developers,  instrument  manufacturers,  and  end  users 
have  often  worked  against  each  other  in  developing  consistent  standards  for  data  interchange. 
Consequently,  today  there  is  no  single  standard  way  to  exchange  or  visualize  scientific 
instrument  data. 

Use of an extensible markup language for data interchange can solve most of these difficulties.
The concept of XML is to enclose data elements between tags that identify each data
element by name and attributes. The best-known markup language is HTML (hypertext
markup language), the lingua franca of the Internet. Together with its type definition, a
marked-up document can be easily interchanged, processed, stored, and visualized by numerous
applications, many of them already developed for Internet use. XML documents are thus free
of ties to specific systems or manufacturers.

To demonstrate the utility of an XML approach for instrument data interchange, SpectroML,
a markup language for molecular spectroscopy data, was created. At present SpectroML is
being developed solely for UV/Vis data to keep the scope of the project manageable. This
document describes this markup language and its environment, shows its structure and
elements, and gives examples and an outlook on applications.
UV/Vis Formats and the Way to SpectroML

To develop the structure and an initial vocabulary for SpectroML, three existing data
interchange formats (GRAMS [1], JCAMP-DX [2], and ANDI [3]) were compared, relevant
ASTM definitions [4] were consulted, and all items related to UV/Vis spectroscopy were
extracted. [Appendix A documents this effort.]

The structure developed while analyzing these formats provided a good base for the XML
vocabulary. Each of the existing formats views the data differently and emphasizes different
elements. Selecting those elements that suit most applications in UV/Vis spectroscopy
provides a good initial vocabulary. Some of the elements can be sub-divided to avoid having
more than one piece of information per element. Because the vocabulary is expressed in XML,
it can be extended fairly easily, so that users can have, for example, an extended SpectroML
containing elements for their own use together with the core that is general for all
applications. [See Appendix B for a short introduction to XML.]

With  the  initial  vocabulary  in  hand,  a DTD  (document  type  definition)  can  be  developed.  This 
allows  checking  an  XML  file  to  determine  its  correctness.  An  XML  schema  can  be  developed 
to  provide  a more  powerful  way  of  validating  XML  documents. 

There are different possibilities for visualizing the data. For example, using XSL (Extensible
Stylesheet Language) stylesheets, the information in the file can be displayed in various ways,
and users can easily adapt this to their needs.

Moreover,  applications  or  plug-ins  for  applications  can  be  developed  to  use  SpectroML  in 
multiple  ways.  The  Java  platform  is  attractive  for  developing  these,  because  it  provides 
platform  independence. 

A well-structured XML file provides a flexible, powerful, complete, and platform-independent
way to store UV/Vis data and exchange them over the Internet.


Figure 1 - The SpectroML logo ("XML for Molecular Spectrometry Data")
Structure of SpectroML

Based on our analysis of existing data interchange formats for molecular spectrometry data,
we developed an initial vocabulary, organized it into a logical and regular structure,
and extended it to provide a linking mechanism. Like all XML documents, the structure is
arranged hierarchically like a tree, starting from a root and with increasingly detailed
sub-elements ending in the leaves, as shown in Figure 2:

• The root element (the ground in Figure 2) contains one or more experiments. The
individual experiments are implicitly related by being grouped into one document; however,
they can be explicitly related via linking references.

• Each experiment (the tree trunk in Figure 2) contains five groups. The file group is a
header group that describes all the datasets within an experiment. Each of the other four
groups (instrument, sample, measurement, and data) describes a different aspect of the
dataset; the data group also contains the data values themselves.

• Each of these four groups (a main branch in Figure 2) contains two different blocks.
Generally speaking, the blocks divide the group data into a fixed part and a variable part.
Each of these two blocks can appear several times. A block's ID (identification string) makes
it possible to reuse one block for different datasets within an experiment. For example,
one instrument block can be used with several samples without being repeated for each dataset.

• Each block (a smaller branch in Figure 2), except for the core data block, contains sections
(twigs in the figure). A section divides a block into different sub-aspects. In this
specification, each of these blocks has two sections; however, this is not mandatory and
can be expanded in future versions.

• Each  section  (a  twig  in  Figure  2)  contains  data  elements  to  hold  the  data  and  metadata. 

• Each  element  (a  leaf  in  Figure  2)  may  contain  sub-elements.  This  allows  storing  structured 
data  in  an  element.  Each  element  can  also  have  an  attribute,  such  as  a format  description 
for  the  data  contained. 
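
The hierarchy described in the list above can be sketched in a few lines of Python. This is only an illustrative skeleton, not part of the specification: it uses the group, block, and ID attribute names from the element tables, while the helper name build_skeleton, the ID values, and the decision to leave the blocks empty are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Each group and its two blocks, as named in the element tables.
GROUP_BLOCKS = {
    "instrument": ("instrumentDescription", "instrumentProperty"),
    "sample": ("sampleDescription", "sampleProperty"),
    "measurement": ("measurementDescription", "measurementProperty"),
    "data": ("dataProperty", "dataCore"),
}


def build_skeleton(experiment_id="exp1", suffix="1"):
    """Build root -> experiment -> groups -> blocks (hypothetical helper)."""
    root = ET.Element("SpectroML", {"version": "1.0"})
    experiment = ET.SubElement(
        root, "experiment", {"type": "UV/Vis", "experimentId": experiment_id}
    )
    ET.SubElement(experiment, "file")  # header group, filled in later
    for group_name, block_names in GROUP_BLOCKS.items():
        group = ET.SubElement(experiment, group_name)
        for block_name in block_names:
            # The ID attribute of each block is the block name plus "Id".
            ET.SubElement(group, block_name, {block_name + "Id": block_name + suffix})
    return root


skeleton = build_skeleton()
print(ET.tostring(skeleton, encoding="unicode"))
```

Sections and elements would be added beneath each block in the same way.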


The spectroscopy method is an attribute of an experiment, which means several methods can
be combined within one SpectroML file. The current elements focus on UV/Vis, but the
required metadata for other methods can be added in the future, since the structure that holds
the data values was designed for a broad range of data structures.

It  is  important  to  realize  that  even  though  XML  files  are  human-readable,  they  are  created  to 
be  processed  by  a computer.  The  hierarchy,  its  structure,  its  depth,  and  complexity  are 
designed  to  make  the  XML  file  "parsable"  by  a computer  and  foster  flexibility  and  extensibility. 


Figure  2 - The  hierarchical  structure  of  SpectroML 


Illustration  of  the  Structure  of  SpectroML 

• SpectroML  (the  ground  in  Figure  2)  supports  the  whole  structure. 

• An  experiment  (the  tree  trunk  in  Figure  2)  holds  information  about  the  whole  experiment 
and  contains  groups. 

• A group  (a  main  branch  in  Figure  2)  pertains  to  a specific  data  topic  and  contains  blocks. 

• A block  (a  smaller  branch  in  Figure  2)  separates  groups  into  different  units  and  contains 
sections. 

• A section  (a  twig  in  Figure  2)  divides  a block  into  smaller  units  of  related  data  and  contains 
elements. 

• An  element  (a  leaf  in  Figure  2)  holds  a metadata  or  data  value. 
Datasets - Paths Through the Experiment

A dataset is a path or linkage through the experiment blocks. Datasets are stored in the file
group and connect all eight blocks (two from each of the remaining groups) together. If a given
block is needed in a number of datasets, it can be reused multiple times with different
collections of other blocks without the need for maintaining copies of it. Figure 3 illustrates
the concept of experiment paths within SpectroML:


Figure  3 - Experiment  paths  in  SpectroML 


The eight different colors (or patterns, respectively) represent the eight different block types.
Each block type can appear multiple times as a discrete block (an apple in the figure) in the
experiment, and each block must have a unique ID (each apple in the figure would need its own
ID, for example "id1," "id2," and "id3" for the three "instrument description" blocks).

A collection of exactly one block of each block type is a dataset (the basket of eight apples in
the figure). A path is the list of the eight block IDs that make up this set.
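
As a rough illustration of how an application might follow such a path, the Python sketch below looks up the blocks that a path refers to. The attribute names (pathId, instrumentDescriptionLink, dataCoreLink, and the block ID attributes) come from the element tables; the function names, the ID values, and the abbreviated document with only two of the eight links are assumptions made to keep the example short.

```python
import xml.etree.ElementTree as ET

# Abbreviated experiment: two paths share one instrument description block.
DOC = """<experiment experimentId="exp1">
  <file>
    <path pathId="p1" instrumentDescriptionLink="id1" dataCoreLink="core1"/>
    <path pathId="p2" instrumentDescriptionLink="id1" dataCoreLink="core2"/>
  </file>
  <instrument>
    <instrumentDescription instrumentDescriptionId="id1"/>
  </instrument>
  <data>
    <dataCore dataCoreId="core1"><values dim="y">2 4 8</values></dataCore>
    <dataCore dataCoreId="core2"><values dim="y">3 9 27</values></dataCore>
  </data>
</experiment>"""


def index_blocks(experiment):
    """Map each block ID to its element, skipping the file group."""
    blocks = {}
    for group in experiment:
        if group.tag == "file":
            continue
        for block in group:
            block_id = block.get(block.tag + "Id")
            if block_id is not None:
                blocks[block_id] = block
    return blocks


def resolve_path(experiment, path_id):
    """Return {block type: block element} for the dataset path with this ID."""
    blocks = index_blocks(experiment)
    for path in experiment.findall("./file/path"):
        if path.get("pathId") == path_id:
            return {
                name[: -len("Link")]: blocks[ref]
                for name, ref in path.attrib.items()
                if name.endswith("Link")
            }
    raise KeyError(path_id)


experiment = ET.fromstring(DOC)
dataset = resolve_path(experiment, "p2")
print(dataset["dataCore"].find("values").text)
```

Because both paths point at the same instrument description ID, the block is stored once and shared, which is the reuse the path mechanism is meant to allow.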

Taking Figure 3 as a universe of possible UV/Vis experiments would mean that there were
three available instruments each having the same properties; there was one sample with a
single set of properties; three possible measurements all with the same properties; and three
result data packages all with the same properties.
Data Handling

SpectroML  is  capable  of  storing  multiple  datatypes: 

• single  data  points 

• a single  spectrum 

• multiple  spectra 

• multi-dimensional  data. 

Using  the  typical  XML  mechanism,  data  values  can  be  stored  in  a structure  as  illustrated  in  the 
following  example  showing  three  two-dimensional  data  points: 

<point>
  <x>1</x>
  <y>2</y>
</point>
<point>
  <x>2</x>
  <y>4</y>
</point>
<point>
  <x>3</x>
  <y>8</y>
</point>

But  since  spectra  often  contain  numerous  data  points,  this  simple  approach,  while  functional, 
would  be  unwieldy  because  of  its  huge  amount  of  overhead.  To  minimize  the  overhead, 
SpectroML  can  store  values  in  a more  compact  form  by  using  one  tag  for  the  values  of  one 
dimension  while  incorporating  the  data  as  a list  of  values  separated  by  a whitespace  character 
(e.g.,  space  or  tab): 

<values dim="x">1 2 3</values>
<values dim="y">2 4 8</values>
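
Reading the compact form back into points is straightforward. The following Python sketch is one possible approach, assuming the values elements sit under a common parent (dataCore is used here); the function name read_values is hypothetical.

```python
import xml.etree.ElementTree as ET


def read_values(parent):
    """Return {dimension name: list of floats} for each <values> child."""
    return {
        values.get("dim"): [float(token) for token in values.text.split()]
        for values in parent.findall("values")
    }


# str.split() with no argument accepts spaces and tabs alike.
core = ET.fromstring(
    '<dataCore><values dim="x">1 2 3</values>'
    '<values dim="y">2\t4 8</values></dataCore>'
)
columns = read_values(core)
points = list(zip(columns["x"], columns["y"]))
print(points)
```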

The  name  of  the  dimension  is  not  fixed  in  a tag,  but  is  variable  in  an  attribute;  this  allows  as 
many  dimensions  as  needed.  The  dimension  attribute  provides  the  link  between  the  data  and 
the  related  metadata  elements  (e.g.,  a minimum  value  or  a start  value): 

<values dim="x">1 2 3</values>
<startValue dim="x">1</startValue>

In  cases  where  the  data  values  are  mathematically  related  (such  as  evenly  spaced  x values), 
only  the  initial  (starting)  value  is  needed: 

<values dim="x">1</values>
<values dim="y">2 4 8</values>

Of  course,  when  this  approach  is  used,  one  has  to  provide  the  information  necessary  for 
calculating  the  actual  values  in  the  corresponding  metadata  block,  such  as  the  increment  value 
for  simple  accession  data. 
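
A reader of such a file has to regenerate the omitted values. The Python sketch below shows the arithmetic for evenly spaced data, assuming the start value and the increment have already been read from the file; the function name expand_axis and the increment of 1 are assumptions chosen to match the example above.

```python
def expand_axis(start, increment, count):
    """Return `count` evenly spaced values beginning at `start`."""
    # Multiplying by the index avoids accumulating rounding errors.
    return [start + index * increment for index in range(count)]


y_values = [float(token) for token in "2 4 8".split()]
x_values = expand_axis(start=1.0, increment=1.0, count=len(y_values))
points = list(zip(x_values, y_values))
print(points)
```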
SpectroML Elements

XML  tags  are  case  sensitive.  Tags  in  SpectroML  are  formed  according  to  the  following  rules: 

• Tags  contain  only  letters  from  the  English  alphabet  (ASCII  characters  65-90  and  97-122). 

• Tags  within  the  root  tag  <SpectroML>  begin  with  a lower  case  letter. 

• Each  new  word  in  a tag  starts  with  an  upper  case  letter  for  better  readability. 

• Abbreviations  are  avoided  in  tag  names  as  far  as  possible. 

• Wherever  a physical  value  occurs  as  element  content,  there  must  be  an  attribute  for  its 
unit. 

• Wherever  a data  value  or  calculating  property  occurs,  there  must  be  an  attribute  for  its 
dimension. 
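
The first two rules can be checked mechanically. The Python sketch below is a minimal illustration, not an official validator: the function name is hypothetical, and the rules on word capitalization and abbreviations are not checked because they require human judgment.

```python
import re

# One lower case ASCII letter followed by ASCII letters only.
TAG_PATTERN = re.compile(r"[a-z][A-Za-z]*")


def is_valid_tag(name):
    """Check the letters-only and lower-case-first rules for a tag name."""
    if name == "SpectroML":  # the root tag is the one exception
        return True
    return TAG_PATTERN.fullmatch(name) is not None


for candidate in ("spectralBandWidthRange", "SampleHolder", "slit_width"):
    print(candidate, is_valid_tag(candidate))
```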

In  the  tables  listed  on  the  following  pages,  there  are  different  types  of  elements: 

• elementA:elementB refers to the tree structure and means that A is the parent of B.

• element is a child element of the block element in the header of the table.

• ∞element is an element which can occur several times.

• →element is a sub-element of the element above it.

• •element is an attribute of the element above it.

If  an  element  has  sub-elements,  it  cannot  contain  data  itself  and  has  no  attribute.  It  is 
structured  to  group  information. 

Each  element  that  holds  data  and  each  attribute  must  have  a datatype.  An  element  always 
contains  only  character  data,  but  it  can  represent  a different  datatype,  e.g.,  a float  value.  The 
following  types  are  used: 

• string  for  character  data. 

• language  for  language  setting  of  elements  (ISO  639). 

• date,  time  for  dates  and  times  (ISO  8601). 

• id,  idref,  idrefs  for  identifier  and  references  (XML  DTD). 

• double,  doubles  for  float  values  (IEEE  754-1985)  and  space-separated  doubles. 

• unsignedInt for positive integer values.

• A dash  (-)  in  the  type  field  means  that  the  element  does  not  hold  data  and  thus  has  no 
type. 
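
Since element content is always character data, an application has to convert it according to the declared type. The Python sketch below shows one way to do this for some of the types listed above; the table of converters and the function names are assumptions, and Python's fromisoformat methods accept only a subset of ISO 8601 in older Python versions.

```python
from datetime import date, time


def to_unsigned_int(text):
    """Convert text to an integer and reject negative values."""
    value = int(text)
    if value < 0:
        raise ValueError("unsignedInt must not be negative: " + text)
    return value


CONVERTERS = {
    "string": str,
    "double": float,
    "doubles": lambda text: [float(token) for token in text.split()],
    "unsignedInt": to_unsigned_int,
    "date": date.fromisoformat,
    "time": time.fromisoformat,
}


def convert(type_name, text):
    """Convert element or attribute text according to its type name."""
    return CONVERTERS[type_name](text.strip())


print(convert("date", "2002-04-15"), convert("doubles", "2 4 8"))
```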

The descriptions of most of the elements are taken from the ASTM definitions [4].
Complete Element List

In  the  following  tables  of  this  section,  the  elements  of  SpectroML  are  listed  in  the  order  of  the 
hierarchy  in  which  they  appear.  All  elements  within  the  four  metadata  groups  are  optional,  so 
that  the  user  of  SpectroML  can  decide  which  of  these  are  important  for  his/her  usage. 


SpectroML

Element | Type | Description
•version | string | The version of SpectroML. For the initial SpectroML only ‘1.0’ exists.
experiment | - | An experiment is one dataset. This provides for the possibility of collecting several experiments in one file.

SpectroML:experiment

Element | Type | Description
•type | string | The type of the analytical data in the dataset. For the initial SpectroML only ‘UV/Vis’ exists.
•language | string | The language of the data elements containing text.
•experimentId | ID | A unique ID for the complete experiment. This allows other experiments to refer to it.
file | - | The file group contains information related to the complete experiment.
instrument | - | The instrument group contains all information related to the instrument used.
sample | - | The sample group contains all information related to the sample used.
measurement | - | The measurement group contains all information related to the measuring process.
data | - | The data group contains all information related to the data and the data themselves.

SpectroML:experiment:file

Element | Type | Description
•experimentLinks | IDREFS | A list of references to other experiments within the file.
•externalLinks | string | A list of references to any external data.
title | string | The title of the experiment should be a common name or a short description of the experiment as it would appear on a document or graph. This title is used for all datasets within one experiment.
timeStamp | - | The date and time when the experiment was last modified.
→date | date | The date of the timestamp.
→time | time | The time of day of the timestamp.
∞path | - | A path connects the eight blocks of one dataset by listing their IDs.
•pathId | ID | A unique ID for the dataset path. This allows paths to be distinguished.
•instrumentDescriptionLink | IDREF | The ID of the instrument description block.
•instrumentPropertyLink | IDREF | The ID of the instrument property block.
•sampleDescriptionLink | IDREF | The ID of the sample description block.
•samplePropertyLink | IDREF | The ID of the sample property block.
•measurementDescriptionLink | IDREF | The ID of the measurement description block.
•measurementPropertyLink | IDREF | The ID of the measurement property block.
•dataPropertyLink | IDREF | The ID of the data property block.
•dataCoreLink | IDREF | The ID of the data core block.
comment | string | A comment provides the opportunity to include additional human-readable information about this block.


SpectroML:experiment:instrument

Element | Type | Description
∞instrumentDescription | - | The description block contains all information that describes the instrument and its environment.
∞instrumentProperty | - | The property block contains all instrument settings that can be adjusted.

SpectroML:experiment:sample

Element | Type | Description
∞sampleDescription | - | The description block contains all information that describes the sample and its handling.
∞sampleProperty | - | The property block contains all characteristics of the sample.

SpectroML:experiment:measurement

Element | Type | Description
∞measurementDescription | - | The description block contains all information that is useful for recording a measurement.
∞measurementProperty | - | The property block contains measurement settings adjusted by the user.

SpectroML:experiment:data

Element | Type | Description
∞dataProperty | - | The property block contains information for proper visualization of the raw data.
∞dataCore | - | The core block contains the raw data themselves.


SpectroML:experiment:instrument:instrumentDescription

Element | Type | Description
•instrumentDescriptionId | ID | A unique ID for the instrument description block.
instrumentDesignation | - | The designation section contains the instrument’s designation, owner, and location.
instrumentApplication | - | The application section contains the instrument’s software environment and the operator’s name.

SpectroML:experiment:instrument:instrumentProperty

Element | Type | Description
•instrumentPropertyId | ID | A unique ID for the instrument property block.
instrumentSetting | - | The setting section contains the instrument’s inherent parameters.
instrumentParameter | - | The parameter section contains the instrument’s adjustable parameters.

SpectroML:experiment:sample:sampleDescription

Element | Type | Description
•sampleDescriptionId | ID | A unique ID for the sample description block.
sampleDesignation | - | The designation section contains the sample’s designation, owner, location, and handling method.
samplePreparation | - | The preparation section contains the sample’s preparation method or source, operator, and date.

SpectroML:experiment:sample:sampleProperty

Element | Type | Description
•samplePropertyId | ID | A unique ID for the sample property block.
sampleAttribute | - | The attribute section contains the sample’s inherent properties.
sampleParameter | - | The parameter section contains the sample’s adjustable properties.
SpectroML:experiment:measurement:measurementDescription

Element | Type | Description
•measurementDescriptionId | ID | A unique ID for the measurement description block.
measurementDesignation | - | The designation section contains the measurement’s designation, reference, and owner.
measurementExecution | - | The execution section contains the measurement’s project, operator, and date.

SpectroML:experiment:measurement:measurementProperty

Element | Type | Description
•measurementPropertyId | ID | A unique ID for the measurement property block.
measurementParameter | - | The parameter section contains the measurement’s adjustable parameters.
measurementCorrection | - | The correction section contains the measurement’s correction procedures.

SpectroML:experiment:data:dataProperty

Element | Type | Description
•dataPropertyId | ID | A unique ID for the data property block.
dataParameter | - | The parameter section contains the data’s parameters for proper visualization.
dataCalculation | - | The calculation section contains the data’s parameters for calculating the actual result values.

SpectroML:experiment:data:dataCore

Element | Type | Description
•dataCoreId | ID | A unique ID for the data core block.
∞values | doubles | Holds a list of data values of one dimension separated by whitespace.
•dim | string | The name of the dimension.


SpectroML:instrument:instrumentDescription:instrumentDesignation

Element | Type | Description
identifier | string | A unique character string or number to identify the instrument within the owner’s organization. If available, it should be the barcode of the serial number on the instrument.
manufacturer | string | The name of the manufacturer of the instrument. In the case of an instrument built by the owner’s organization itself, this could be the name of the responsible group or person.
model | string | The model name of the instrument as it appears on the instrument or in its manual. In case this is a special version or has special equipment, this could be listed after the name.
owner | - | The owner is the public agency or authority, group, corporation, partnership, or individual who owns the instrument.
→name | string | Full name of the owner as it would appear on a written document.
→contact | string | Eligible contact information, such as phone number, mail address, or email address.
location | - | The physical location of the instrument within the owner’s organization.
→name | string | Complete name of the location as it appears on the room sign.
→contact | string | Eligible contact information for the room or the person who is responsible for the room, such as phone number, mail address, or email address.
comment | string | A comment provides the opportunity to include additional human-readable information about this block.


SpectroML:instrument:instrumentDescription:instrumentApplication

Element | Type | Description
software | string | The name of the software used to control the instrument and collect the data.
version | string | The version of the controlling software, including add-ins or service-packs.
operatingSystem | string | The name and version of the operating system on which the instrument’s software runs, including add-ins and service-packs.
firmware | string | The revision level of the software in the instrument itself, e.g., its BIOS revision.
operator | - | The operator of the application is the person whose computer account is used for running the software.
→name | string | Full name of the operator as it would appear on a written document.
→contact | string | Eligible contact information, such as phone number, mail address, or email address.
comment | string | A comment provides the opportunity to include additional human-readable information about this block.


SpectroML:instrument:instrumentProperty:instrumentSetting

Element | Type | Description
resolution | double | The minimum increment available for the independent variable or fineness of detail reported for the dependent variable.
•unit | string | The unit of the resolution, e.g., ‘nm’.
linearDispersion | double | The linear distance that light is dispersed in the plane of the exit slit per unit wavelength.
•unit | string | The unit of the dispersion, e.g., ‘mm/nm’.
spectralBandWidthRange | - | The wavelength interval of radiant energy leaving the exit slit measured at half the peak detected power.
→min | double | The shortest wavelength of the bandwidth range.
•unit | string | The unit of the minimum, e.g., ‘nm’.
→max | double | The longest wavelength of the bandwidth range.
•unit | string | The unit of the maximum, e.g., ‘nm’.
wavelengthRange | - | The range of wavelength coverage of which an instrument is capable.
→min | double | The shortest wavelength of the range.
•unit | string | The unit of the minimum, e.g., ‘nm’.
→max | double | The longest wavelength of the range.
•unit | string | The unit of the maximum, e.g., ‘nm’.
absorbanceRange | - | The range of absorbance coverage of which an instrument is capable.
→min | double | The smallest absorbance of the range.
•unit | string | The unit of the minimum, e.g., ‘AU’.
→max | double | The largest absorbance of the range.
•unit | string | The unit of the maximum, e.g., ‘AU’.
detectorTypes | string | The types or names of the detector used for measuring.
sourceTypes | string | The types or names of the light source.
comment | string | A comment provides the opportunity to include additional human-readable information about this block.


SpectroML:instrument:instrumentProperty:instrumentParameter

Element | Type | Description
slitWidth | double | The physical width of the slit of the wavelength selection device.
•unit | string | The unit of the slit width, e.g., ‘mm’.
spectralSlitWidth | double | The effective spectral bandwidth of the wavelength selection device as defined by the physical slit-width divided by the linear dispersion.
•unit | string | The unit of the spectral slit width, e.g., ‘nm’.
beamChannel | string | The beam channel used in the instrument.
sampleHolder | string | The type or name of the sample holder unit.
samplePosition | string | The position of the sample within the sample holder unit.
scanSpeed | double | The speed of the scan per time interval or for the whole scan.
•unit | string | The unit of the speed, e.g., ‘nm/s’.
integrationTime | double | The amount of time used to measure at one specific wavelength.
•unit | string | The unit of the time, e.g., ‘ms’.
pointSeparation | double | The spacing between two wavelength values in case a whole spectrum is measured.
•unit | string | The unit of the separation, e.g., ‘nm’.
comment | string | A comment provides the opportunity to include additional human-readable information about this block.
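
Because every physical value carries its unit in an attribute, a reader can return the number and the unit together. The Python sketch below illustrates this for two elements of the instrumentParameter section; the function name read_quantity and the sample values are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def read_quantity(element):
    """Return (value, unit) for an element that holds a physical value."""
    unit = element.get("unit")
    if unit is None:
        raise ValueError("<" + element.tag + "> has no unit attribute")
    return float(element.text), unit


section = ET.fromstring(
    "<instrumentParameter>"
    '<slitWidth unit="mm">0.5</slitWidth>'
    '<scanSpeed unit="nm/s">10</scanSpeed>'
    "</instrumentParameter>"
)
for child in section:
    print(child.tag, read_quantity(child))
```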

Version  1.01  [April  2002] 




SpectroML:sample:sampleDescription:sampleDesignation

Element | Type | Description
identifier | string | A unique character string or number to identify the sample within the owner’s organization. If available, it should be the bar code on the sample.
name | string | A common or trade name for the sample or a name according to a sample index.
owner | - | The public agency or authority, group, corporation, partnership, or individual who owns the sample.
→name | string | Full name of the owner as it would appear on a written document.
→contact | string | Eligible contact information, such as phone number, mail address, or email address.
location | - | The physical location of the sample within the owner’s organization.
→name | string | Complete name of the location as it appears on the room sign.
→contact | string | Eligible contact information for the room or the person who is responsible for the room, such as phone number, mail address, or email.
casNumber | string | The registry number of the sample compound according to the Chemical Abstracts Service Index.
formula | string | Molecular formula of the compound in the common notation. Elemental symbols should be arranged with carbon first, followed by hydrogen, and then remaining element symbols in alphabetic order.
storageMethod | string | The name or description of the method to store the sample.
disposalMethod | string | The name or description of the method to dispose of the sample.
comment | string | A comment provides the opportunity to include additional human-readable information about this block.


SpectroML:sample:sampleProperty:samplePreparation 

Element 

Type 

Description 

procedureMethod 

string 

The  name  or  short  description  of  the  procedure 
used  to  prepare  the  sample. 

timeStamp 

- 

The  date  and  time  when  the  sample  was  prepared 
or  purchased. 

— >date 

date 

The  date  of  the  timestamp. 

— >time 

time 

The  day  of  the  timestamp. 

operator 

- 

The  person  who  was  responsible  for  the  sample's 
preparation. 

— >name 

string 

Full  name  of  the  operator  as  it  would  appear  on  a 
written  document. 

^contact 

string 

Eligible  contact  information,  such  as  phone 
number,  mail  address,  or  email  address. 

supplier 

- 

The organization from which the sample was
acquired.

->name

string

Full name of the supplier as it would appear on a
written document.

->contact

string 

Eligible  contact  information,  such  as  phone 
number,  mail  address,  or  email  address. 

procedureDescription 

string 

A detailed  description  of  the  procedure  method 
used  to  prepare  the  sample. 

comment 

string 

A comment  provides  the  opportunity  to  include 
additional  human-readable  information  about  this 
block. 

SpectroML:sample:sampleDescription:sampleAttribute 

Element 

Type 

Description 

molecularWeight 

double 

The  molecular  weight  of  the  sample. 

•unit 

string 

The  unit  of  the  molecular  weight,  e.g.,  ‘atomic 
mass  unit’. 

meltingPoint 

double 

The  melting  point  temperature  of  the  sample. 

•unit 

string 

The  unit  of  the  temperature,  e.g.,  ‘°C’. 

boilingPoint 

double 

The  boiling  point  temperature  of  the  sample. 

•unit 

string 

The unit of the temperature, e.g., ‘°C’.

density 

double 

The  density  of  the  sample. 

•unit 

string 

The unit of the density, e.g., ‘g/mL’.

refractiveIndex

double

The refractive index of the sample.

•unit

string

The unit of the refractive index, usually a ratio
relative to air at a specific temperature and
wavelength.

comment 

string 

A comment  provides  the  opportunity  to  include 
additional  human-readable  information  about  this 
block. 

SpectroML:sample:sampleProperty:sampleParameter 

Element 

Type 

Description 

state 

string 

The  state  of  the  sample  at  the  time  of  measuring. 
Usually  that  is  ‘gas’,  ‘solid’,  or  ‘liquid’,  but  it  is  also 
possible  to  specify  other  states. 

pathLength 

double 

The  distance  traveled  by  the  light  beam  through 
the  sample,  usually  the  internal  width  of  the  sample 
holder. 

•unit 

string 

The  unit  of  the  distance,  e.g.,  ‘cm’. 

amount 

double 

The amount of the sample. This could be a volume
or a mass.

•unit 

string 

The unit of the amount, e.g., ‘mL’.

pressure 

double 

The  pressure  of  the  sample  at  the  time  of 
measuring. 

•unit 

string 

The  unit  of  the  pressure,  e.g.,  ‘Pa’. 

temperature 

double 

The  current  temperature  of  the  sample  at  the  time 
of  measuring. 

•unit 

string 

The  unit  of  the  temperature,  e.g.,  ‘°C’. 

humidity 

double 

The  current  humidity  of  the  sample  at  the  time  of 
measuring. 

•unit 

string 

The unit of the humidity, e.g., ‘%’.

comment 

string 

A comment  provides  the  opportunity  to  include 
additional  human-readable  information  about  this 
block. 

SpectroML:measurement:measurementDescription:measurementDesignation 

Element 

Type 

Description 

identifier 

string 

A unique  character  string  or  number  to  identify  the 
measurement  within  the  owner's  organization.  It 
can  be  a barcode  or  an  entry  number  in  a 
laboratory  notebook. 

title 

string 

The  title  of  the  experiment  part  as  it  would  appear 
on  a written  document. 

owner

-

The public agency or authority, group, corporation,
partnership, or individual who owns the
measurement results.

->name

string

Full name of the owner as it would appear on a
written document.

->contact

string

Eligible contact information, such as phone
number, mail address, or email address.

laboratoryReference

string 

A reference  to  any  other  documentation  method  of 
the  measurement,  e.g.,  a laboratory  notebook. 

comment 

string 

A comment  provides  the  opportunity  to  include 
additional  human-readable  information  about  this 
block. 

SpectroML:measurement:measurementDescription:measurementExecution 

Element 

Type 

Description 

project

string 

The  name  of  the  project  within  the  owner’s 
organization  to  which  the  measurement  belongs. 

timeStamp 

- 

The  date  and  time  when  the  measurement  was 
performed. 

->date

date

The date of the timestamp.

->time

time

The time of day of the timestamp.

operator 

- 

The  person  who  performed  the  experiment. 

->name

string

Full name of the operator as it would appear on a
written document.

->contact

string 

Eligible  contact  information,  such  as  phone 
number,  mail  address,  or  email  address. 

comment

string

A comment provides the opportunity to include
additional human-readable information about this
block.
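
The timeStamp element splits a moment into a date part and a time part. The following Python sketch shows how an application might recombine them. It assumes that the date and time values use the ISO 8601 forms of the XML Schema `date` and `time` types (for example `2001-06-15` and `14:30:00`) and that the sub-elements are nested as shown; the fragment and its values are invented for illustration.

```python
import datetime
import xml.etree.ElementTree as ET

# Hypothetical fragment; element nesting follows the table above.
FRAGMENT = """
<measurementExecution>
  <timeStamp>
    <date>2001-06-15</date>
    <time>14:30:00</time>
  </timeStamp>
</measurementExecution>
"""


def read_timestamp(section):
    """Combine the date and time sub-elements of timeStamp into one datetime."""
    stamp = section.find("timeStamp")
    day = datetime.date.fromisoformat(stamp.findtext("date").strip())
    clock = datetime.time.fromisoformat(stamp.findtext("time").strip())
    return datetime.datetime.combine(day, clock)


when = read_timestamp(ET.fromstring(FRAGMENT))
print(when.isoformat())
```

The same helper would apply to every timeStamp-style element in the tables, because they all share the date/time pair.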


SpectroML:measurement:measurementProperty:measurementParameter 

Element 

Type 

Description 

measurementType 

string 

The  type  of  the  measurement,  e.g.,  a sample 
measurement  or  blank  measurement. 

scanMode 

string 

The  name  of  the  experiment  mode,  e.g.,  measuring 
discrete  wavelengths  or  measuring  a spectrum. 

referenceSample

string 

The  reference  sample  used  in  the  measurement. 

•sampleDescriptionLink 

string 

The  ID  of  a sample  description  block  of  the 
reference  sample  or  any  other  reference  which 
links  to  it. 

filter 

string 

The  name  or  type  of  the  filter  used  in  the 
measurement  to  exclude  certain  wavelengths. 

signalNoise 

string 

The  name  or  type  of  the  signal-to-noise  processing 
used  for  correction. 

scanNumbers 

unsignedInt

The  number  of  scans  used  to  average  the  final 
value. 

scanDuration 

double 

The  total  amount  of  time  to  collect  all  data  for  this 
measurement. 

•unit 

string 

The  unit  of  the  time,  e.g.,  ‘s’. 

comment 

string 

A comment  provides  the  opportunity  to  include 
additional  human-readable  information  about  this 
block. 

SpectroML:measurement:measurementProperty:measurementCorrection 

Element 

Type 

Description 

qualificationTimeStamp

- 

The  date  and  time  when  the  instrument  was 
qualified,  usually  after  purchase  or  major  upgrade. 

->date

date

The date of the timestamp.

->time

time

The time of day of the timestamp.

qualificationReference

string 

A reference  to  a file  or  other  document  which 
contains  the  data  for  the  qualification  test. 

proficiencyTimeStamp

- 

The  date  and  time  when  a measurement  against  a 
standards  institution  (e.g.,  NIST)  was  performed. 

->date

date

The date of the timestamp.

->time

time

The time of day of the timestamp.

proficiencyReference

string 

A reference  to  a file  or  other  document  which 
contains  the  data  for  the  proficiency  test. 

transmittanceTimeStamp 

- 

The  date  and  time  when  the  transmittance  linearity 
was  checked. 

->date

date

The date of the timestamp.

->time

time

The time of day of the timestamp.

transmittanceReference

string

A reference to a file or other document which
contains the data for the transmittance linearity
test.

wavelengthTimeStamp 

- 

The  date  and  time  when  the  wavelength  calibration 
was  performed. 

->date

date

The date of the timestamp.

->time

time

The time of day of the timestamp.

wavelengthReference

string 

A reference  to  a file  or  other  document  which 
contains  the  data  for  the  wavelength  calibration. 

comment

string

A comment provides the opportunity to include
additional human-readable information about this
block.

SpectroML:data:dataProperty:dataParameter 

Element 

Type 

Description 

axisLabel 

- 

Contains  the  labels  for  each  axis,  as  they  would 
appear  in  a graph. 

->∞axis

string 

The  name  of  the  label  for  one  axis. 

•dim 

string 

The  name  of  the  dimension  of  the  axis. 

axisUnit 

- 

Contains  the  units  of  the  data  values  for  each  axis. 

->∞axis

string 

The  name  of  the  unit  for  one  axis. 

•dim 

string 

The  name  of  the  dimension  of  the  axis. 

minimumValue

-

Contains the minimum values for each axis, as set
by the operator or measured by the instrument.
This can be used during the visualization process
to adjust the size of the graph.

->∞value

double 

The  minimum  value  for  one  axis. 

•dim 

string 

The  name  of  the  dimension  of  the  axis. 

maximumValue

-

Contains the maximum values for each axis, as set
by the operator or measured by the instrument.
This can be used during the visualization process
to adjust the size of the graph.

->∞value

double 

The  maximum  value  for  one  axis. 

•dim 

string 

The  name  of  the  dimension  of  the  axis. 

comment

string

A comment provides the opportunity to include
additional human-readable information about this
block.
SpectroML:data:dataProperty:dataCalculation

Element 

Type 

Description 

scaleFactor

-

Contains the factors by which the dataset values
must be multiplied to get the actual values. This is
used to have convenient numbers in the data core.

->∞value

double 

The  scale  factor  for  one  axis. 

•dim 

string 

The  name  of  the  dimension  of  the  axis. 

numberPoints

unsignedInt

The total number of points in the dataset. A point is
a set of numbers belonging to one measured
value.

pointIncrement

- 

Contains  the  fixed  increment  along  each  axis  from 
one  point  to  the  next  point. 

->∞value

double 

The  point  increment  for  one  axis. 

•dim 

string 

The  name  of  the  dimension  of  the  axis. 

startValue

-

Contains the start value along each axis. When
values are evenly spaced, they can be calculated
by using a fixed increment value and a start value,
so they do not have to appear in the data core
section.

->∞value

double 

The  start  value  for  one  axis. 

•dim 

string 

The  name  of  the  dimension  of  the  axis. 

comment

string

A comment provides the opportunity to include
additional human-readable information about this
block.
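
The relationship between startValue, pointIncrement, numberPoints, and scaleFactor can be illustrated with a short Python sketch. The function names and the sample numbers are assumptions made for this example; the sketch keeps the two calculations separate because the table does not state whether the scale factor also applies to calculated axis values.

```python
def evenly_spaced_axis(start_value, point_increment, number_points):
    """Calculate the values of an evenly spaced axis from its start value and increment."""
    return [start_value + i * point_increment for i in range(number_points)]


def apply_scale_factor(stored_values, scale_factor):
    """Multiply stored dataset values by the scale factor to get the actual values."""
    return [value * scale_factor for value in stored_values]


# Four points starting at 200 with an increment of 2 (illustrative numbers only).
wavelengths = evenly_spaced_axis(200.0, 2.0, 4)
# Values stored as convenient integers and scaled back by 0.001.
absorbances = apply_scale_factor([125, 250, 500, 1000], 0.001)

for x, y in zip(wavelengths, absorbances):
    print(x, y)
```

Storing only the start value and the increment keeps the data core small for evenly spaced measurements, which is the purpose the startValue description names.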
SpectroML Files

The  structure  and  its  elements  were  transformed  into  an  XML  structure  and  its  related 
documents.  They  are  fully  contained  within  this  document  (and  can  be  downloaded  from  the 
project page http://www.cstl.nist.gov/nist839/839.04/index.html) and are ready to use. They can be
viewed  and  edited  with  any  common  text  editor;  however,  to  demonstrate  their  functionality, 
some  tools  are  required: 

• a browser  that  can  process  a DTD 

(e.g.,  MS  Internet  Explorer  5.5,  http://www.microsoft.com/windows/ie/default.htm) 

• a parser  that  can  process  XML  Schemas 
(e.g.,  Apache  Xerces,  http://xml.apache.org) 

• a browser  that  can  process  an  XML  Stylesheet 

(e.g.,  MS  Internet  Explorer  6.0,  http://www.microsoft.com/windows/ie/default.htm) 

• a programming  package  to  handle  XML  documents  in  applications 
(e.g., Hunter/McLaughlin’s JDOM, http://www.jdom.org)

The  SpectroML  sample  file,  DTD,  Schema,  and  Stylesheet  are  accessible  on  the  Internet  at 
the site hosted by XML.org (http://www.xml.org).

Sample  file 

A sample  file  was  created  that  uses  all  the  elements  of  SpectroML.  It  is  based  on  a real 
measurement  of  a sample  of  tap  water  using  an  HP  8453  diode  array  spectrophotometer  at 
three  specific  wavelengths.  The  file  was  then  completed  with  information  from  the  laboratory 
and  the  literature,  and,  where  necessary,  populated  with  some  arbitrary  values. 

[see  Appendix  C for  the  code  listing] 

The current sample file is available at: http://www.xml.org/xml/schema/2c09ac55/SpectroML.xml
Document Type Definition

A document type definition (DTD) makes it possible to define precisely the structure of an XML
document and to verify whether or not a document belongs to a certain type. The DTD is part of the XML
specification (http://www.w3.org/TR/2000/REC-xml-20001006).

A DTD  for  SpectroML  has  been  defined,  and  a SpectroML  document  using  it  must  refer  to  it  by 
adding  the  following  line  (which  must  be  placed  below  the  XML  header  line): 

<!DOCTYPE SpectroML SYSTEM "http://www.xml.org/xml/schema/2c09ac55/SpectroML.dtd">

The URL (Uniform Resource Locator) refers to the SpectroML DTD located in the repository
at  the  XML.org  site. 

The SpectroML DTD is structured as follows:

• The  elements  must  be  defined  in  the  order  of  their  hierarchy  level. 

• Within  each  structure  level,  elements  are  listed  alphabetically. 

• The  elements  have  a defined  order. 

• All  elements  are  set  as  optional,  except  for  the  header  information  in  the  file  group. 

• All  attributes  of  elements  are  set  as  mandatory. 

• The  blocks  are  identified  with  IDs  and  referred  to  via  IDREFs. 

To  validate  the  SpectroML  document  against  its  DTD,  a parser  must  read  the  document  in 
validation  mode.  Those  browsers  that  can  display  XML  files  are  capable  of  doing  this. 

The  DTD  mechanism  has  some  drawbacks:  for  example,  it  does  not  permit  an  arbitrary 
element  order  without  explicit  specification  of  each  of  the  different  possibilities.  This  is,  in  most 
cases,  not  practical  due  to  the  huge  number  of  possible  permutations.  Furthermore,  the  DTD 
concept does not support different datatypes; the only data element type is character data. But
since  the  DTD  remains  the  standard  mechanism  to  define  document  types  for  XML,  a DTD  for 
SpectroML  is  maintained  as  well. 
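
The size of the permutation problem can be made concrete with a small Python sketch. It spells out every ordering of a few child elements the way a content model would have to list them, and then counts the orderings for a whole section. The helper name is an assumption of this example, the element names are taken from the sampleParameter table, and the generated string is meant only to show the growth in size, not to serve as a recommended content model.

```python
import itertools
import math


def any_order_content_model(names):
    """Spell out every possible sequence of the given child elements."""
    sequences = (
        "(" + ",".join(order) + ")" for order in itertools.permutations(names)
    )
    return "(" + " | ".join(sequences) + ")"


# Three elements already require six alternatives.
print(any_order_content_model(["state", "amount", "comment"]))

# The sampleParameter section lists seven elements.
SAMPLE_PARAMETER_ELEMENTS = [
    "state", "pathLength", "amount", "pressure",
    "temperature", "humidity", "comment",
]
print(math.factorial(len(SAMPLE_PARAMETER_ELEMENTS)))  # 5040 orderings
```

Even this count understates the problem, because it assumes that every element occurs exactly once, whereas most SpectroML elements are optional.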

XML  Schema  is  another  way  to  define  XML  document  types,  and  its  approach  is  considerably 
more  flexible.  Accordingly,  we  developed  a schema  for  SpectroML.  It  is  important  to  realize 
that  a document  valid  against  a schema  may  not  necessarily  be  valid  against  the 
corresponding  DTD. 

The  following  file  contains  the  document  type  definition  of  SpectroML  as  described  above: 

[see  Appendix  C for  the  code  listing] 

The  current  SpectroML  DTD  is  also  available  at: 

http://www.xml.org/xml/schema/2c09ac55/SpectroML.dtd
Schema

The XML Schema specification was recently released by the W3C (World Wide Web
Consortium) (http://www.w3.org/TR/xmlschema-0) and may replace the DTD concept in the future.
However, DTD is part of the XML specification, whereas XML Schema is not. Therefore a
schema must be specified in the root element via the namespace mechanism
(http://www.w3.org/TR/REC-xml-names/). To validate the document against the schema, a parser
must read the document in validating mode and be capable of dealing with XML Schema. In
the future, most browser versions should support this.

The  root  element  of  SpectroML  looks  like  this: 

<SpectroML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="http://www.xml.org/xml/schema/2c09ac55/SpectroML.xsd">

This  declares  that  the  instance  of  the  SpectroML  document  type  is  specified  without  its  own 
namespace  and  uses  the  2001  XML  Schema  version.  The  URL  refers  to  the  SpectroML 
schema  located  in  the  repository  at  the  XML.org  site. 
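
An application can read the schema reference from the root element before deciding how to validate the file. The following Python sketch uses the standard library's ElementTree; the document string is a shortened, empty SpectroML root written for this example. Note that ElementTree only parses the document and does not validate it against the schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the XML Schema instance attributes, as declared in the root element.
XSI = "http://www.w3.org/2001/XMLSchema-instance"

DOCUMENT = (
    '<SpectroML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation='
    '"http://www.xml.org/xml/schema/2c09ac55/SpectroML.xsd"/>'
)

root = ET.fromstring(DOCUMENT)
# ElementTree addresses namespaced attributes as "{namespace}localname".
schema_location = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")

print(root.tag)
print(schema_location)
```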

Using  a schema  provides  several  advantages  over  the  DTD  mechanism: 

• It  is  a regular  XML  file  itself,  using  a set  of  tags  to  describe  the  document  type  without 
special  mechanisms  like  a DTD;  this  makes  it  parseable  like  a SpectroML  file. 

• It  permits  the  specification  of  different  datatypes,  for  example  measured  values  can  be  of 
type float instead of a general CDATA type; typing allows some checking of element content.

• It  permits  the  specification  of  elements  that  may  occur  in  any  order;  for  example,  elements 
within  a section  do  not  need  to  appear  in  a strict  order,  which  is  sensible,  since  they  are 
already  tagged  and  contained  within  a defined  structure. 

However,  due  to  the  powerful  type  structure,  schema  files  are  more  verbose  and  physically 
much  larger  than  DTDs. 

The  SpectroML  schema  was  based  on  the  SpectroML  DTD  and  uses  the  same  structure.  The 
following  new  features  were  added: 

• Sections  within  a block  can  appear  in  any  order. 

• Elements  within  a section  can  appear  in  any  order. 

• Datatypes  were  applied  to  the  elements. 

• The  data  values  for  each  dimension  have  a list  type. 
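
A list type holds several items in one element, separated by whitespace. The sketch below shows how such content might be read in Python. The element name `values` and its content are hypothetical; the actual tag for the data values of a dimension is defined in the schema listing.

```python
import xml.etree.ElementTree as ET

# Hypothetical element holding the data values of one dimension.
FRAGMENT = '<values dim="x">190.0 200.0 210.0</values>'


def read_list(element):
    """Split whitespace-separated list content and convert each item to a float."""
    return [float(item) for item in element.text.split()]


element = ET.fromstring(FRAGMENT)
numbers = read_list(element)
print(element.get("dim"), numbers)
```

Because each item is converted individually, a malformed number raises an error at read time, which is the kind of content checking that typing makes possible.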

[see  Appendix  C for  the  code  listing] 

The  current  SpectroML  schema  is  available  at: 

http://www.xml.org/xml/schema/2c09ac55/SpectroML.xsd
Stylesheet

Originally,  stylesheets  were  used  to  format  a document  for  different  types  of  output  media,  so 
that  one  document  could  be  used  with  several  stylesheets,  for  example,  a different  one  each 
for  a web  site,  a handout,  or  a book  to  provide  an  output  appropriate  to  the  medium. 

XML stylesheets have their own powerful language (XSLT) based on an XML tag set. XSLT is a
transformation language that takes a source XML document and transforms it using a set of
rules into a target document. This is used extensively when transferring data between two
systems. The XSLT specification is available at: http://www.w3.org/TR/xslt.html

A commonly used XSL application is the transformation of an XML document into an HTML
document to display the XML content in a more readable way. A stylesheet was developed to view
SpectroML documents in a browser or another XSLT processor. To use it, add a line to the XML
file that refers to it:

<?xml-stylesheet type="text/xsl"
    href="http://www.xml.org/xml/schema/2c09ac55/SpectroML.xsl"?>

The  URL  refers  to  the  SpectroML  stylesheet  located  in  the  repository  at  the  XML.org  site. 
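
A program that processes SpectroML files can check whether a stylesheet has been assigned by looking for this processing instruction. The following Python sketch uses the standard library's `xml.dom.minidom`; the function name and the shortened document are assumptions made for this example.

```python
from xml.dom import minidom

DOCUMENT = (
    '<?xml version="1.0"?>'
    '<?xml-stylesheet type="text/xsl" '
    'href="http://www.xml.org/xml/schema/2c09ac55/SpectroML.xsl"?>'
    '<SpectroML/>'
)


def stylesheet_instructions(xml_text):
    """Return the data of every xml-stylesheet processing instruction before the root."""
    document = minidom.parseString(xml_text)
    return [
        node.data
        for node in document.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


for instruction in stylesheet_instructions(DOCUMENT):
    print(instruction)
```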

Some recent browser versions are capable of processing an XML stylesheet, and this might
become a standard browser feature in the future. For example, opening an XML document usually
yields  the  display  of  the  tree  structure  of  the  file;  however,  if  a stylesheet  is  assigned  to  the 
document,  its  output  could  be  displayed  instead. 

The  stylesheet  for  SpectroML  does  the  following: 

• It  lists  all  elements  within  the  file. 

• It  groups  them  by  experiment,  group,  block,  and  section. 

• Attribute  values  are  listed  in  round  brackets. 

• IDs  of  experiments,  blocks,  and  paths  are  listed  in  square  brackets. 

• Datasets  are  not  listed  separately,  but  since  the  IDs  and  paths  are  displayed,  one  can  see 
which  blocks  belong  together. 

• Data  points  are  listed  like  regular  elements  as  they  appear  in  the  file,  so  there  is  no  special 
processing  or  visualization  for  them. 

To  enhance  the  stylesheet,  mainly  to  visualize  the  data  elements,  further  programming  and  a 
much  more  complicated  stylesheet  would  be  necessary.  This  will  likely  be  done  as  part  of  the 
future  application  development  for  SpectroML. 

[see Appendix C for the code listing]

The  current  SpectroML  stylesheet  is  available  at: 

http://www.xml.org/xml/schema/2c09ac55/SpectroML.xsl
SpectroML Applications

SpectroML lends itself to many useful applications (Figure 4), such as:


• a converter  that  transfers  data  from  another  format  into  SpectroML  and  vice  versa; 

• an  editor  that  adds  further  information  to  a SpectroML  file  manually  that  cannot  be  put  in 
automatically  or  that  can  construct  SpectroML  files  easily  when  no  automation  is  available; 

• an  enhanced  stylesheet  that  displays  both  metadata  and  data  in  a convenient  way; 

• a virtual  library  that  stores  performed  experiments  and  provides  for  queries  through  a web 
portal; 

• a viewer that displays SpectroML with user-definable views;

• a plug-in  for  office  software  that  assists  in  getting  experiment  data  into  spreadsheets, 
presentations,  or  paper  documents; 


• a database application that receives SpectroML files for storage and then retrieves and
transmits  them  on  request. 
Figure 4 - Some SpectroML applications

(1:  visualization  stylesheet,  2:  editor  application,  3:  visualization  applet) 
SpectroML API

SpectroML  applications  need  to  perform  one  or  more  of  the  following  operations: 

• read  SpectroML  data  to  display  or  process  them 

• edit  SpectroML  data  to  alter  files  or  build  new  ones 

• write  SpectroML  data  to  transfer  or  store  them. 

For  example,  to  change  an  element  in  a SpectroML  document,  the  following  steps  are 
necessary: 

• read  the  XML  file 

• parse  the  file  and  validate  it 

• step  through  the  XML  document  to  find  a certain  location 

• change  the  content  of  an  element 

• rebuild  the  XML  document 

• store  the  XML  file. 

This  procedure  requires  an  extensive  amount  of  programming  code  each  time  one  deals  with  a 
SpectroML  file.  To  minimize  the  programming  effort  for  each  new  application,  it  would  be  very 
convenient  to  have  a toolkit  that  provides  abstract  functions: 

• open("file.xml")

• change(element, content)

• store("new xml file").
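
As an illustration of how such abstract functions could hide the individual steps, the following Python sketch wraps the standard library's ElementTree. It is not the SpectroML API itself: the class name `SpectroMLFile`, the path syntax, and the sample element path are hypothetical, and the sketch parses the file without validating it.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


class SpectroMLFile:
    """Hypothetical convenience wrapper offering open, change, and store."""

    def __init__(self, tree):
        self._tree = tree

    @classmethod
    def open(cls, path):
        # Read and parse the XML file (no DTD or schema validation here).
        return cls(ET.parse(path))

    def change(self, element_path, content):
        # Step through the document to the element and replace its content.
        element = self._tree.getroot().find(element_path)
        if element is None:
            raise KeyError(element_path)
        element.text = content

    def store(self, path):
        # Rebuild the XML document and write it to a file.
        self._tree.write(path, encoding="utf-8", xml_declaration=True)


# Demonstration with a temporary, made-up file.
with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, "file.xml")
    target = os.path.join(folder, "new.xml")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("<SpectroML><sample><comment>old</comment></sample></SpectroML>")

    document = SpectroMLFile.open(source)
    document.change("sample/comment", "new")
    document.store(target)
    stored_comment = ET.parse(target).getroot().findtext("sample/comment")

print(stored_comment)
```

With a wrapper of this kind, an application needs three calls instead of repeating the six steps listed above for every file.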

We  are  building  a toolkit  in  the  form  of  an  API  (application  program  interface)  that  provides  a 
number  of  functions  to  work  with  SpectroML  files,  their  metadata,  and  data. 
Digital signatures

In  many  laboratory  environments,  it  is  essential  to  demonstrate  the  integrity  of  experimental 
data  to  ensure  that  any  manipulation  or  tampering  can  be  detected.  SpectroML  files  are 
regular  ASCII  (American  Standard  Code  for  Information  Interchange)  text,  and  therefore  it  is 
easy  to  alter  the  data,  either  intentionally  or  not.  Even  when  there  is  no  need  for  completely 
secure  data,  there  often  is  a need  to  establish  the  origin  of  the  data  and  its  subsequent  history. 

In  a laboratory  notebook,  one  certifies  a dataset  with  a written  signature.  In  a similar  fashion, 
computerized  datasets  can  be  "signed"  by  enclosing  them  with  a digital  signature.  There  are 
several  mechanisms  to  do  this,  but  basically,  they  all  have  an  algorithm  that  calculates  a 
unique  byte  sequence  (a  signature)  based  on  the  content  of  the  file  itself.  This  sequence  is 
delivered  together  with  the  data  file  and  a recipient  can  validate  it,  as  long  as  he/she  knows  the 
algorithm.  If  an  element  in  a signed  file  were  changed  after  applying  the  signature,  the 
subsequent  validation  would  fail. 
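
The principle can be demonstrated with a few lines of Python. The sketch below uses a keyed hash (HMAC with SHA-256) from the standard library as a simple stand-in for a signature algorithm; it is not the XML Signature mechanism, which additionally canonicalizes the XML and typically relies on public-key algorithms. The key and the file content are made up for this example.

```python
import hashlib
import hmac


def sign(content: bytes, key: bytes) -> str:
    """Calculate a byte sequence (here as hex text) that depends on the content."""
    return hmac.new(key, content, hashlib.sha256).hexdigest()


def verify(content: bytes, key: bytes, signature: str) -> bool:
    """Recalculate the signature and compare it with the delivered one."""
    return hmac.compare_digest(sign(content, key), signature)


key = b"shared-secret-for-illustration"
original = b"<SpectroML><comment>tap water</comment></SpectroML>"
signature = sign(original, key)

# Changing a single element after signing makes the validation fail.
tampered = original.replace(b"tap", b"sea")

print(verify(original, key, signature))  # True
print(verify(tampered, key, signature))  # False
```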

XML  provides  a mechanism  for  digital  signatures.  It  is  not  yet  officially  released,  but  it  is  fully 
operable (http://www.w3.org/TR/xmldsig-core/). The following are its main features:

• A signature  element  contains  all  information  about  the  validation  process. 

• The  signature  element  can  either  become  a part  of  the  XML  document  that  it  signs,  or  it 
can  be  put  into  a separate  file. 

• The  signature  element  refers  to  an  XML  document  or  object  and  specifies  the  methods  of 
validation. 

• A signature  value  contains  the  calculated  digital  signature. 

SpectroML  has  no  built-in  tags  for  signatures,  but  it  can  be  signed  via  the  mechanism  provided 
by  the  XML  Signature  routine. 
Conclusion

SpectroML  is  still  largely  a proposal.  Even  though  SpectroML  is  ready  for  use,  it  has  yet  to  be 
tested  in  practice.  We  are  now  beginning  to  write  applications  to  use  SpectroML.  Therefore, 
reviewer  comments  and/or  suggestions  are  solicited  and  will  be  highly  appreciated 
[gary.kramer@nist.gov]. 

Now  that  the  SpectroML  structure  and  its  elements  together  with  its  DTD/Schema  are  in  hand, 
anyone  can  use  SpectroML.  All  that  is  needed  is  a text  editor  and  some  of  the  many  free  tools 
available  on  the  Internet.  To  utilize  SpectroML  in  an  application,  one  can  use  an  XML  API  or 
the  soon  to  be  available  SpectroML  API. 

At  present,  SpectroML  is  focused  on  UV/Vis  spectroscopy.  But  its  structure  and  its  flexible  data 
model  should  make  it  easily  adaptable  to  other  fields  of  spectroscopy.  Our  ultimate  goal  is  to 
build  SpectroML  into  a standard  that  will  benefit  everyone  who  deals  with  spectrometric  data. 
Appendix A: Analysis of Existing UV/Vis Data Formats

Introduction 

Our  decision  to  create  SpectroML  did  not  mean  starting  from  scratch.  There  was  simply  no 
need  to  do  this.  We  believed  that  the  terminology,  data  dictionaries,  and  concepts  embodied 
in  existing  standards,  instrument  software,  and  data  interchange  formats  could  be  leveraged  to 
facilitate  the  development  of  SpectroML.  We  wanted  to  take  advantage  of  the  large  body  of 
work  that  has  been  done  in  the  field  of  spectrometric  data  interchange  rather  than  re-inventing 
it.  With  this  concept  of  reuse  firmly  in  hand,  we  studied  terminology  definitions  in  normative 
standards  [4],  spectrometer  operation  and  software  manuals,  and  existing  native  and 
interchange  formats  in  hopes  of  extracting  the  most  useful  parts  of  each. 

There  are  many  different  data  formats  for  analytical  data  interchange.  Each  of  them  has  a 
different  viewpoint  and  emphasis,  and  each  concentrates  on  different  elements.  Therefore,  the 
first  step  in  developing  a markup  vocabulary  for  UV/Vis  spectroscopy  data  interchange  was  to 
analyze  these  formats  and  to  extract  their  most  useful  parts. 

For  this  document  three  interchange  formats  were  selected  for  study: 

• GRAMS SPC (Galactic Industries Corp., 9/97) [1]

(http://www.galactic.com/instruments/spc.htm)

• JCAMP-DX (Joint Committee on Atomic and Molecular Physical Data, Data Exchange, 9/87) [2]

(http://www.isas-dortmund.de/projects/jcamp/protocol.html)

• ANDI/NetCDF (ASTM E2077, E2078, 3/00) [3]

(http://enterprise.astm.org/PAGES/E2077.htm)

The  documents  were  examined,  and  all  UV/Vis  related  items  were  extracted  and  compared. 
This  revealed  similarities  and  differences  in  the  approaches  used  to  store  the  data  (data 
values)  and  metadata  (descriptive  elements  concerning  the  data).  Combining  the  best  from 
each  format  provided  a good  starting  vocabulary  and  structure  for  the  development  of 
SpectroML. 


GRAMS SPC

SPC  is  a format  developed  by  Galactic  Industries  Corp.  that  is  used  internally  in  their  products 
and  as  an  exchange  format  between  applications.  It  is  designed  for  a variety  of  different  types 
of  data  taken  by  laboratory  analytical  instrumentation.  SPC  consists  of  a header  describing  the 
content  of  the  file,  followed  by  a binary  storage  area  for  the  instrumentation  data,  and  an 
optional  block  for  storing  additional  information. 

The  header  and  the  data  area  are  in  binary  format;  this  means  that  all  the  elements  are  in  a 
defined  order  and  each  element  has  a fixed  length.  The  log  block  is  a text  block,  where  the 
elements  are  defined  as  keys  and  their  values.  This  section  has  a number  of  predefined  keys 
for  different  types  of  data  files,  but  is  open  for  user-defined  keys,  as  well. 

The structure of the data area depends on the type of data. SPC distinguishes the following
types:

• single evenly (one spectrum, evenly spaced X values)

=> X values calculated, one block for the Y values

• multi evenly (multiple spectra, evenly spaced X values)

=> X values calculated, multiple blocks for Y values

• single unevenly (one spectrum, unevenly spaced X values)

=> one block each for the X and Y values

• multi unevenly common (multiple spectra, unevenly spaced X values, same in all spectra)

=> one block for X values, multiple blocks for Y values

• multi unevenly unique (multiple spectra, unevenly spaced X, different for each spectrum)

=> alternating blocks for X and Y values for each spectrum.
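
For the evenly spaced types, only the Y values are stored and the X values are calculated. The Python sketch below shows one way to do this, assuming that the header supplies the first X value, the last X value, and the number of points. This is an assumption of the example; the text above states only that the X values are calculated. The numbers are made up.

```python
def calculated_x_values(first_x, last_x, number_points):
    """Calculate evenly spaced X values between the first and the last X value."""
    if number_points < 2:
        return [first_x] * number_points
    step = (last_x - first_x) / (number_points - 1)
    return [first_x + i * step for i in range(number_points)]


# One block of stored Y values (illustrative numbers only).
y_block = [0.12, 0.34, 0.29, 0.08, 0.02]
x_values = calculated_x_values(400.0, 800.0, len(y_block))

# Pair each calculated X value with its stored Y value.
spectrum = list(zip(x_values, y_block))
print(spectrum)
```

For the multi evenly type, the same X values would be reused for every Y block.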
JCAMP-DX

The  JCAMP-DX  format  was  developed  in  the  late  1980s  and  continues  to  be  used  for  data 
exchange.  It  is  a character-based  format  consisting  of  text  lines,  each  containing  a defined 
keyword  and  its  value.  It  has  a number  of  required  items  (core  data),  followed  by  optional 
information  and  parameters,  and  a data  block  in  various  formats. 

Because  of  its  text-based  format,  each  key  value  contains  a number  of  characters,  but  its  type 
can  be  integer  or  float.  The  elements  do  not  have  to  be  in  a defined  order.  JCAMP  allows 
building  blocks  of  sub-files  for  more  than  one  spectrum. 

The  data  block  can  be  organized  as: 

• a point  list  (XY..XY  or  XYZ...XYZ) 

=> data points come in pairs or triplets; this is used for unevenly spaced X values or for
better human readability

• an ordinate list (X++(Y..Y))

=> a line starts with an X value and is followed by a number of Y values; this is used for
evenly spaced X values.

The  data  themselves  can  appear  in  the  following  formats: 

• fixed  form 

=> each number has a fixed number of characters

• packed form

=> adjacent values are separated by a space or a sign

• squeezed form

=> delimiter, leading digit, and sign are replaced by a pseudo-digit

• difference form

=> delimiter, leading digit, and sign of the difference between adjacent values are
represented by a pseudo-digit

• difference duplicate form

=> in addition to difference form, a run of identical values is replaced by the value and the
number of times it appears.
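The compressed forms are easier to follow with a small decoder. The Python sketch below handles one line of integer values; it is not a complete JCAMP-DX reader. The pseudo-digit tables are written from our reading of the JCAMP-DX specification and should be checked against it, the function name is hypothetical, and decimal points and the check value that links consecutive difference-form lines are deliberately ignored.

```python
# Pseudo-digit tables (assumed from the JCAMP-DX specification):
# squeezed form, difference form, and duplicate counts.
SQZ = {c: i for i, c in enumerate("@ABCDEFGHI")}
SQZ.update({c: -i for i, c in enumerate("abcdefghi", start=1)})
DIF = {c: i for i, c in enumerate("%JKLMNOPQR")}
DIF.update({c: -i for i, c in enumerate("jklmnopqr", start=1)})
DUP = {c: i for i, c in enumerate("STUVWXYZs", start=1)}


def decode_line(line):
    """Decode one line of packed, squeezed, difference, or duplicate values."""
    values = []
    last_was_difference, last_difference = False, 0
    i, n = 0, len(line)
    while i < n:
        c = line[i]
        if c.isspace():
            i += 1
            continue
        # Gather the ordinary digits that follow the leading character.
        j = i + 1
        while j < n and line[j].isdigit():
            j += 1
        tail = line[i + 1:j]
        if c in SQZ or c in DIF:
            lead = SQZ[c] if c in SQZ else DIF[c]
            magnitude = int(str(abs(lead)) + tail)
            number = -magnitude if lead < 0 else magnitude
            if c in SQZ:
                values.append(number)  # absolute value
                last_was_difference = False
            else:
                values.append(values[-1] + number)  # offset from previous value
                last_was_difference, last_difference = True, number
        elif c in DUP:
            count = int(str(DUP[c]) + tail)  # the count includes the original
            for _ in range(count - 1):
                step = last_difference if last_was_difference else 0
                values.append(values[-1] + step)
        elif c in "+-" or c.isdigit():
            values.append(int(c + tail))  # fixed or packed form
            last_was_difference = False
        else:
            raise ValueError(f"unexpected character {c!r}")
        i = j
    return values


decoded = decode_line("A23J5Tj@")
```

In the string `A23J5Tj@`, `A23` is the absolute value 123, `J5` adds 15, `T` repeats that difference once more, `j` subtracts 1, and `@` is the absolute value 0.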

ANDI/NetCDF

ANDI (Analytical Data Interchange) is a subset of the NetCDF (Network Common Data Form)
format and is specified for both mass spectrometry and chromatography data
interchange.  Unlike  the  SPC  and  JCAMP-DX  formats,  each  ANDI  method  has  its  own  protocol 
and  contains  very  technique-specific  information.  There  is  no  official  protocol  for  UV/Vis  data 
(although  draft  versions  for  infrared  and  diode  array  UV/Vis  spectroscopy  have  been 
circulated),  so  our  analysis  is  based  primarily  on  the  protocol  for  mass  spectrometric  data. 
Much  of  it  proved  relevant  for  spectrophotometric  data. 

The  whole  format  is  defined  in  a C-language-like  structure,  where  each  element  has  a fixed 
type  and  length.  When  this  structure  is  stored  in  a file,  it  becomes  one  binary  block.  The 
elements  are  divided  into  categories,  and  some  of  them  are  required  for  data  completeness. 

The  data  structure  can  contain  a number  of  spectra.  Each  axis  has  an  array  for  its  values.  The 
data  can  be  organized  as: 

• pairs  or  triplets 

=> values at the same position in each array belong together; this is used for unevenly
spaced values

• single  array 

=> only the Y values are stored in the array, the others are calculated; this is used for
evenly spaced values.
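Both layouts can be sketched in a few lines of Python. The function names are hypothetical, and the start value and point separation are assumed to be available as metadata alongside the Y array.

```python
def pair_arrays(x_values, y_values):
    """Unevenly spaced data: values at the same index belong together."""
    if len(x_values) != len(y_values):
        raise ValueError("axis arrays must have the same length")
    return list(zip(x_values, y_values))


def pair_single_array(y_values, start, separation):
    """Evenly spaced data: only Y is stored, X is calculated from metadata."""
    return [(start + i * separation, y) for i, y in enumerate(y_values)]


uneven = pair_arrays([270.0, 655.0, 976.0], [0.1, 0.2, 0.3])
even = pair_single_array([0.1, 0.2, 0.3], start=270.0, separation=1.0)
```

The single-array layout stores fewer numbers, but it depends on the start value and the point separation being recorded correctly.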

UV/Vis Elements

All  the  UV/Vis-related  items  gleaned  from  the  formats  we  examined  were  extracted  and 
organized  into  five  groups: 

• File  (header  information) 

• Instrument  (information  about  the  instruments  used) 

• Sample  (information  about  the  processed  samples) 

• Measurement  (information  about  the  measurement  process) 

• Data  (data  values  and  information  about  its  structure) 

Each  group  was  divided  into  sub-groups  and  the  elements  were  listed  together  with  their 
datatypes  and  a description.  The  datatypes  include: 

• values  (integer  and  floating  point) 

• strings  (parseable  strings  and  free  text) 

• items  (values  or  strings  out  of  a defined  list) 

• arrays  (of  the  previous  types). 

This is the collection of terms that formed the basis for the development of the
SpectroML-UV/Vis vocabulary and structure.
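One way to picture this collection is as a flat list of records, each tagged with its group, sub-group, source format, and datatype. The Python sketch below is only an illustration of that organization; the class and function names are hypothetical, and the three entries are taken from the data parameter tables that follow.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    VALUE = "value"    # integer or floating point
    STRING = "string"  # parseable string or free text
    ITEM = "item"      # value or string out of a defined list
    ARRAY = "array"    # array of the previous types


GROUPS = ("file", "instrument", "sample", "measurement", "data")


@dataclass(frozen=True)
class Element:
    group: str
    subgroup: str
    source: str  # GRAMS, JCAMP, or ANDI
    name: str
    datatype: DataType
    description: str

    def __post_init__(self):
        if self.group not in GROUPS:
            raise ValueError(f"unknown group: {self.group}")


ELEMENTS = [
    Element("data", "parameters", "GRAMS", "npts", DataType.VALUE,
            "number of data points"),
    Element("data", "parameters", "JCAMP", "npoints", DataType.VALUE,
            "number of data points"),
    Element("data", "parameters", "ANDI", "data points number", DataType.VALUE,
            "number of actual data points"),
]


def by_group(elements, group):
    """Return the elements that belong to one of the five groups."""
    return [element for element in elements if element.group == group]
```

Listing the elements this way makes it easy to see where the three formats describe the same quantity under different names.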

File group

The file group contains:

• description

=> information about the complete dataset

• user

=> information about the creators of the dataset

File description

GRAMS

Name                    Type          Description
file description        text          memo or comment that describes the data in the file
file format version     item          file format version of Galactic SPC files [old, new, new2]
name                    string        data file name, optional path and extension

JCAMP

Name                    Type          Description
jcampdx                 string        version of JCAMP-DX
title                   text          concise description of the spectrum

ANDI

Name                    Type          Description
admin comments          text          comments about the dataset identification of the experiment
dataset time stamp      string        date and time at which the source file was created (relative to GMT)
experiment title        text          meaningful name of the experiment
languages               string array  list of human and programming languages delineated for processing
netcdf revision         string        revision level of NetCDF data interchange system
source file reference   string        adequate information to locate the original dataset
source file version     string        version of the data file format

File user

GRAMS

Name                    Type          Description
user                    text          user's or analyst's name

JCAMP

Name                    Type          Description
origin                  text          name of organization, including address, phone, individual contributors
owner                   text          name of owner of a proprietary spectrum, including copyright

ANDI

Name                    Type          Description
dataset origin          text          name of organization, including address, phone, individual contributors
dataset owner           text          name of owner of a proprietary dataset, including copyright
operator name           text          name of person who ran the equipment that acquired the dataset

Instrument group

The instrument group contains:

• description

=> general information about the instrument and its manufacturer

• properties

=> instrument settings that are inherent and information about the instrument environment

• parameters

=> instrument settings that can be set by the user.

Instrument description

GRAMS

Name                    Type          Description
source description      string        unique instrument and model name

JCAMP

Name                    Type          Description
spectro system          text          manufacturer's name, model, software system, release number

ANDI

Name                    Type          Description
comments                text          comments about instrument
id                      string        laboratory's identification code
manufacturer            string        name of manufacturer
model number            string        model number or name
name                    string        generic descriptive name
serial number           string        manufacturer's serial number

Instrument properties

GRAMS

Name                    Type          Description
bdelay                  float         begin delay (in s)
det                     string        detector type name
gain                    float         detector gain factor
pmt                     float         photomultiplier tube voltage
resolution              string        resolution for data collection, including unit of resolution
sdelay                  float         scan delay (in s)
src                     string        source type name
swppm                   float         spectral bandwidth

JCAMP

Name                    Type          Description
resolution              float         nominal resolution (in abscissa units)

ANDI

Name                    Type          Description
application software    string        name and revision level of software module
calibration history     string array  audit trail of datasets that records calibration history
detector max value      float         maximum output value of the detector (in detector units)
detector min value      float         minimum output value of the detector (in detector units)
detector potential      float         potential of detector (in V)
detector unit           string        name of unit of raw data
firmware version        string        revision level of instrument firmware, applies to non-data components
operation system        string        name and revision level of data system's operating system
resolution              float         spectrometer resolution
software version        string        revision level of instrument software, applies to non-data components

Instrument parameters

GRAMS

Name                    Type          Description
channel                 integer       detector or beam channel used
detcor                  items         detector correction mode [on, off]
slit                    float         slit aperture width
speed                   item          scan speed or velocity description

JCAMP

Name                    Type          Description
deltax                  float         nominal spacing between points
parameters              text          list of essential instrumental settings

ANDI

Name                    Type          Description
point separation        float         separation of spectral data points

Sample group

The sample group contains:

• description

=> description of the sample and its classification

• properties

=> state and property of the sample

• environment

=> information about the sample environment, its preparation, and handling.

Sample description

GRAMS

Name                    Type          Description
id                      integer       identification
name                    string        name

JCAMP

Name                    Type          Description
beilstein number        string        structural formula code according to the Beilstein system
cas name                string        name according to conventions as described in CAS Index Guide
cas registry number     string        registry number according to Chemical Abstract Service indices, Merck Index or CAS Online
description             text          description for compounds, including composition, origin, appearance, interpretations
molform                 string        molecular formula
names                   string array  list of common, trade, or other names
wiswesser               string        structural formula according to Wiswesser notation

ANDI

Name                    Type          Description
cas name                string        name according to Chemical Abstract Service
cas number              integer       registry number according to Chemical Abstract Service
chemical formula        string        chemical formula
comments                text          comments about sample
external id             string        number or code assigned by submitter
internal id             string        number or code assigned within the laboratory or LIMS
other names             string array  list of additional names
owner                   text          name of sample owner or submitter
receipt time stamp      string        date and time the sample was received or submitted for analysis (relative to GMT)
smiles notation         string        SMILES notation
type                    items         type of sample [standard, unknown, control, blank]
wiswesser notation      string        Wiswesser notation

Sample properties

GRAMS

Name                    Type          Description
amount                  float         sample volume or amount

JCAMP

Name                    Type          Description
bp                      float         boiling point (in °C)
density                 float         density (in g/cm3)
mp                      float         melting point (in °C)
mw                      float         molecular weight
refractive index        float         refractive index (relative to air at 20 °C)

ANDI

Name                    Type          Description
boiling point           float         boiling point (in °C)
chemical mass           float         formula chemical mass, computed using average atomic masses for each element
melting point           float         melting point (in °C)

Sample environment

GRAMS

Name                    Type          Description
solvent                 string        solvent used

JCAMP

Name                    Type          Description
concentration           string array  list of known components and impurities and their concentration and units of concentration
path length             float         length of the path through sample (in cm)
pressure                string        pressure and unit of pressure
state                   string        state (gas, liquid, solid...)
temperature             float         temperature (in °C)

ANDI

Name                    Type          Description
comments                text          comments concerning preparation
disposal information    text          description of disposal procedure
history                 text          description of the history of the particular sample, including special handling, treatments
injection time stamp    string        date and time the sample was injected (relative to GMT)
matrix                  text          description of natural matrix from which the sample was selected
precautions             text          safety issues when the sample is manually handled
procedure               text          description of procedure used to prepare sample for analysis
procedure name          string        procedure used to select a sample from its natural bulk matrix
sample thickness        float         thickness of the sample (in cm)
state                   items         state [solid, liquid, gas, supercritical fluid, plasma, other state]
storage information     text          description of storage location and conditions

Measurement group

The measurement group contains:

• description

=> general information about the measurement

• parameters

=> measurement settings influenced by the user.

Measurement description

GRAMS

Name                    Type          Description
collection timestamp    string        date and time the data were collected
comment                 text          description of measurement
scanmode                items         scan mode [spectrum, time scan, multi wavelength time scan]
scantype                items         scan type [sample, zero line, baseline]

JCAMP

Name                    Type          Description
cross reference         string array  cross references to additional spectra of the same sample
date                    string        date when spectrum was acquired
sampling procedure      text          description of mode of observation, including additional information
source reference        string        adequate identification to locate original spectrum
time                    string        time when spectrum was acquired

ANDI

Name                    Type          Description
experiment type         string        type of experiment
sampling technique      items         sampling technique [transmission, reflectance, absorbance, diffuse reflectance, other]

Measurement parameters

GRAMS

Name                    Type          Description
filter                  string        optical filter

JCAMP

Name                    Type          Description
class                   string        class of spectrum according to Coblentz Class and IUPAC Class

ANDI

Name                    Type          Description
calibration times       integer       number of times the data were calibrated before yielding final results
processed times         integer       number of times the data were processed to yield final results
scan numbers            integer       number of scans
scan time               float         scan time (in s)

Data group

The data group contains:

• parameters

=> information belonging to the raw data

• processing

=> information about the processing of the raw data after acquisition

• values

=> the acquired data values as described above in each format explanation.

Data parameters

GRAMS

Name                    Type          Description
first                   float         X value corresponding to first Y data point
last                    float         X value corresponding to last Y data point
npts                    integer       number of data points
winc                    float         W value increment for 4D data
wplanes                 integer       number of planes for 4D data
wtype                   items         allowed W axis type [list of types, e.g., °C]
xtype                   items         allowed X axis type [list of types, e.g., nm]
yscaling                integer       scaling exponent for Y data values
ytype                   items         allowed Y axis type [list of types, e.g., AU]
zinc                    float         Z value increment for evenly spaced Z axes
ztype                   items         allowed Z axis type [list of types, e.g., s]

JCAMP

Name                    Type          Description
firstx                  float         first actual abscissa value
firsty                  float         actual Y value corresponding to first X value
lastx                   float         last actual abscissa value
maxx                    float         largest actual X value in the spectrum
maxy                    float         largest actual Y value in the spectrum
minx                    float         smallest actual X value in the spectrum
miny                    float         smallest actual Y value in the spectrum
npoints                 integer       number of data points
xfactor                 float         factor by which X values are multiplied to obtain actual values
xlabel                  string        label for X axis
xunits                  items         abscissa units [1/cm, µm, nm, s]
yfactor                 float         factor by which Y values are multiplied to obtain actual values
ylabel                  string        label for Y axis
yunits                  items         ordinate units [transmittance, reflectance, absorbance, arbitrary]

ANDI

Name                    Type          Description
data points number      integer       number of actual data points
raw data comments       text          comments relevant to the raw data
starting point          float         value of first spectral data point
xaxis label             string        label for X axis
xaxis range             float array   maximum range of X axis, minimum and maximum value
xaxis scale             float         scaling factor applied to the X axis
xaxis unit              items         units for X axis [list of types, e.g., nm]
yaxis label             string        label for Y axis
yaxis range             float array   maximum range of Y axis, minimum and maximum value
yaxis scale             float         scaling factor applied to the Y axis
yaxis unit              items         units for Y axis [list of types, e.g., AU]
zaxis label             string        label for Z axis
zaxis range             float array   maximum range of Z axis, minimum and maximum value
zaxis scale             float         scaling factor applied to the Z axis
zaxis unit              items         units for Z axis [list of types, e.g., s]

Data processing

GRAMS

Name                    Type          Description
correction              items         sample baseline correction mode [on, off]
corrtype                items         type of correction [normal, zero line, zero sra, zero stdref]
namebg                  string        background or baseline spectrum filename
namestd                 string        standard reference correction filename
namnezr                 string        zero reference correction filename
signoise                items         signal/noise processing mode [on, off]
snlevel                 float         acceptable signal/noise ratio

JCAMP

Name                    Type          Description
data processing         text          description of data processing, e.g., including correction, smoothing, subtraction

ANDI

Name                    Type          Description
error log               string        information about failures of any type
post experiment         string        names of programs used to process raw data after acquisition
pre experiment          string        names of programs run prior to the start of acquisition

Conclusion

The  data  elements  from  the  three  data  formats  show  different  levels  of  detail  in  each  group; 
often  one  format  concentrates  more  on  a specific  group  than  the  others  do.  Not  all  elements 
are  of  the  same  importance.  It  is  difficult  to  decide  which  elements  are  necessary  and  which 
are  not;  that  can  depend  on  a specific  application.  But  after  perusing  the  list,  we  found  that 
some  elements  were  not  directly  part  of  the  experiment,  and  so  we  felt  that  they  should  be 
stored  separately  elsewhere  and  designated,  where  necessary,  with  a link. 

However,  the  extracted  group  structure  provides  a good  start  for  designing  the  corresponding 
XML  structure.  There  is  one  root  element,  which  contains  five  groups  and  their  blocks.  The 
blocks  store  the  collection  of  data  elements.  They  can  then  also  be  divided,  depending  on  their 
datatype.  Taken  together  these  data  elements  are  useful  as  an  initial  vocabulary  for 
SpectroML,  which  covers  those  elements  used  in  most  cases  for  UV/Vis  spectroscopy.  XML 
affords  access  to  all  its  inherent  advantages  and  to  the  tools  that  come  with  it  to  create  a data 
format  more  powerful,  more  flexible,  more  extensible,  and  easier  to  use  than  any  of  those 
currently  available. 

Appendix B: A Short Introduction to XML

Since some familiarity with XML ("Extensible Markup Language") and its related concepts is
essential to understanding SpectroML, we provide here a brief background that covers only
what is needed to understand the structure and elements of SpectroML.

For more depth, there are a number of very good tutorials and a huge pool of resources on the
topic on the Internet. Here are some recommended starting points:

• The World Wide Web Consortium (www.w3c.org) contains all specifications for XML and its
related technologies, together with references to tools and more.

• The XML Industry Portal (www.xml.org) maintains a repository for XML files and has many
references to tutorials and more.

• Other related sites include: www.xml.com and www.xml101.com.

Markup 

The concept of markup is much older than its use with computers, but it became popular with
the development of HTML, the markup language for documents on the Internet. The basic
principle of markup is tagging: enclosing parts of a document between a start tag and an end
tag:

<title>This is a title.</title>

Tags can be structured hierarchically to encapsulate or structure related data:

<sample>
  <id>1001</id>
  <name>water</name>
</sample>

Tags can contain attributes that contain data:

<sample id="1001">water</sample>

IDs  are  special  attributes.  They  permit  a unique  identification  of  elements  and  are  used  to 
differentiate  one  element  from  another. 

An  XML  file  is  a fully  tagged  text  file;  this  means  that  it  starts  and  ends  with  one  root  tag,  that  it 
contains  an  arbitrary  number  of  subtags,  and  that  all  content  is  enclosed  in  tags.  An  XML  file  is 
human-readable,  but  designed  to  be  processed  by  computers. 
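The two tagging styles shown above can be read with any XML parser. The following Python sketch uses the standard library module `xml.etree.ElementTree`; the function name `describe` is hypothetical and the code is only meant to show how elements, attributes, and content appear to a program.

```python
import xml.etree.ElementTree as ET

nested = "<sample><id>1001</id><name>water</name></sample>"
flat = '<sample id="1001">water</sample>'


def describe(xml_text):
    """Return the root tag, its attributes, its text, and its child elements."""
    root = ET.fromstring(xml_text)
    return {
        "tag": root.tag,
        "attributes": dict(root.attrib),
        "text": (root.text or "").strip(),
        "children": {child.tag: child.text for child in root},
    }


nested_view = describe(nested)
flat_view = describe(flat)
```

In the nested form the identifier is an element of its own, while in the flat form the same information is carried by an attribute of the `sample` tag.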

Document Type Definition (DTD)

Every XML document must be well-formed, but to check that it is also valid, its document type
must be defined. The standard way to do that is to write a DTD and refer to the DTD in the
header of the XML file.

The  DTD  specifies  the  names  of  the  elements  and  attributes  and  their  order  of  appearance. 
This  allows  a parser  to  check  a document  and  initiate  further  processing,  for  example,  to 
extract  or  to  change  data. 

The  DTD  mechanism  has  some  drawbacks:  for  example,  the  datatypes  are  basically  all  text 
types,  there  is  no  way  to  assign  datatypes  as  with  a programming  language;  and  a DTD  forces 
elements  to  appear  in  predefined  order.  Despite  its  shortcomings,  it  is  still  the  standard  way  to 
define  XML  documents. 
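The difference between a well-formed and a valid document can be sketched in Python. The standard library parser only checks well-formedness, so the second function below is a hand-written stand-in for a single DTD rule, not a DTD validator; the child sequence it checks is the one the SpectroML DTD gives for the experiment element, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if the tags are properly nested and closed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Required child sequence of the experiment element.
EXPERIMENT_CHILDREN = ["file", "instrument", "sample", "measurement", "data"]


def follows_sequence(xml_text, expected=EXPERIMENT_CHILDREN):
    """Stand-in for one content-model check: children must appear in order."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == expected


ordered = ("<experiment><file/><instrument/><sample/>"
           "<measurement/><data/></experiment>")
swapped = ("<experiment><instrument/><file/><sample/>"
           "<measurement/><data/></experiment>")
```

Both strings are well-formed, but only the first respects the predefined order of elements, which is the kind of constraint a validating parser would enforce from the DTD.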

XML  Schema 

XML Schema uses XML tags themselves to define a document type, rather than a separate
syntax of its own as a DTD does. A schema is much more powerful than a DTD; for example, it
provides for a variety of datatypes and allows an arbitrary ordering of elements.

XML Schemas have been used for some time, but only recently was the approved
specification officially released by the W3C. The schema mechanism will likely replace DTDs
in the future.

Namespace 

Defining  tag  vocabularies  in  document  types  raises  the  problem  of  name  collision  (multiple 
usage  of  the  same  name  tag  for  different  entities).  The  concept  of  namespaces  introduces  a 
unique  prefix  for  each  tag,  so  that  multiply  defined  tags  can  be  distinguished  or  even  used 
within  the  same  document: 

<person1:name>...

<person2:name>...

To declare a namespace, a URL (Uniform Resource Locator) is assigned to each prefix. This
requires that valid locations for namespace definitions be maintained; otherwise applications
that use the namespace may be broken.
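A short Python sketch shows how a parser keeps two tags with the same name apart once each prefix is bound to a URL. The two `example.org` URLs, the element content, and the short prefixes used for the lookup are invented for illustration.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<record xmlns:person1="http://example.org/ns/customer"
        xmlns:person2="http://example.org/ns/employee">
  <person1:name>Ada</person1:name>
  <person2:name>Grace</person2:name>
</record>
"""

root = ET.fromstring(DOCUMENT.strip())

# The parser replaces each prefix with the URL it was declared with.
tags = [child.tag for child in root]

# Lookups can use any local prefix, as long as it maps to the same URL.
prefixes = {"c": "http://example.org/ns/customer",
            "e": "http://example.org/ns/employee"}
customer = root.find("c:name", prefixes).text
employee = root.find("e:name", prefixes).text
```

The prefix itself is only a local abbreviation; the URL bound to it is what makes each name unique.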

Transformation  and  Stylesheets 

A transformation language, XSLT, is used with XML to transform one class of XML documents
into another. A common case is transforming an XML document into an HTML (HyperText
Markup Language) document to display its data with a network browser. The mapping
information for such transformations is contained in a stylesheet. Stylesheets contain rules that
define patterns in the XML document and linkages to corresponding output elements.
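The idea of rules that link patterns to output elements can be imitated in a few lines of Python. This is not XSLT: it is a hand-written stand-in that uses only the standard library, and the rule table and function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Each rule links a source tag (the pattern) to an HTML output element.
RULES = {"name": "h1", "id": "p"}


def to_html(xml_text, rules=RULES):
    """Apply the rules to every element, in document order."""
    source = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for node in source.iter():
        target = rules.get(node.tag)
        if target is not None:
            out = ET.SubElement(body, target)
            out.text = (node.text or "").strip()
    return ET.tostring(html, encoding="unicode")


page = to_html("<sample><id>1001</id><name>water</name></sample>")
```

A real stylesheet expresses the same mapping declaratively, so that the rules can be changed without touching the program that applies them.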

Appendix C: SpectroML Code

Sample file

<?xml version="1.0" encoding="iso-8859-1"?>
<!-- SpectroML sample file, 6/5/01 -->
<SpectroML version="1.0">
  <experiment type="UV/Vis" language="en-us" experimentId="e0">
    <file experimentLinks="e0" externalLinks="">
      <title>sample experiment</title>
      <timeStamp>
        <date>2000-11-01</date>
        <time>10:00:00</time>
      </timeStamp>
      <path pathId="p0" instrumentDescriptionLink="id0" instrumentPropertyLink="ip0"
            sampleDescriptionLink="sd0" samplePropertyLink="sp0" measurementDescriptionLink="md0"
            measurementPropertyLink="mp0" dataPropertyLink="dp0" dataCoreLink="dc0"/>
      <comment>simple measurement of drinking water</comment>
    </file>
    <instrument>
      <instrumentDescription instrumentDescriptionId="id0">
        <instrumentDesignation>
          <identifier>25547UV-247</identifier>
          <manufacturer>Hewlett Packard</manufacturer>
          <model>HP 8453</model>
          <owner>
            <name>NIST, ACSL</name>
            <contact>gary.kramer@nist.gov</contact>
          </owner>
          <location>
            <name>Filter Lab</name>
            <contact>john.travis@nist.gov</contact>
          </location>
          <comment>UV/Vis diode array spectrometer</comment>
        </instrumentDesignation>
        <instrumentApplication>
          <software>Hewlett-Packard ChemStation</software>
          <version>Win system 1.0</version>
          <operatingSystem>Windows NT 4.0 SP 5</operatingSystem>
          <firmware>1.0</firmware>
          <operator>
            <name>Paul DeRose</name>
            <contact>paul.derose@nist.gov</contact>
          </operator>
          <comment>standard installation, advanced mode</comment>
        </instrumentApplication>
      </instrumentDescription>
      <instrumentProperty instrumentPropertyId="ip0">
        <instrumentSetting>
          <resolution unit="nm">1</resolution>
          <linearDispersion unit="mm/nm">1</linearDispersion>
          <spectralBandWidthRange>
            <minimum unit="nm">1.5</minimum>
            <maximum unit="nm">1.5</maximum>
          </spectralBandWidthRange>
          <wavelengthRange>
            <minimum unit="nm">1.5</minimum>
            <maximum unit="nm">1.5</maximum>
          </wavelengthRange>
          <absorbanceRange>
            <minimum unit="AU">0</minimum>
            <maximum unit="AU">4</maximum>
          </absorbanceRange>
          <detectorTypes>diode array</detectorTypes>
          <sourceTypes>tungsten + deuterium lamp</sourceTypes>
          <comment>standard properties</comment>
        </instrumentSetting>
        <instrumentParameter>
          <slitWidth unit="mm">1</slitWidth>
          <spectralSlitWidth unit="nm">1</spectralSlitWidth>
          <beamChannel>1</beamChannel>
          <sampleHolder>multi sample unit</sampleHolder>
          <samplePosition>1</samplePosition>
          <scanSpeed unit="ms">500</scanSpeed>
          <pointSeparation unit="nm">1</pointSeparation>
          <comment>standard parameters</comment>
        </instrumentParameter>
      </instrumentProperty>
    </instrument>
    <sample>
      <sampleDescription sampleDescriptionId="sd0">
        <sampleDesignation>
          <identifier>1063546374</identifier>
          <name>water</name>
          <owner>
            <name>NIST, ACSL</name>
            <contact>alexander.ruehl@nist.gov</contact>
          </owner>
          <location>
            <name>Filter Lab</name>
            <contact>john.travis@nist.gov</contact>
          </location>
          <casNumber>7732-18-5</casNumber>
          <formula>H2O</formula>
          <storageMethod>no storage</storageMethod>
          <disposalMethod>water sink</disposalMethod>
          <comment>one time use</comment>
        </sampleDesignation>
        <samplePreparation>
          <procedureMethod>fill</procedureMethod>
          <timeStamp>
            <date>2001-11-01</date>
            <time>09:30:00</time>
          </timeStamp>
          <operator>
            <name>Alexander Ruehl</name>
            <contact>alexander.ruehl@nist.gov</contact>
          </operator>
          <supplier>
            <name>Filter Lab</name>
            <contact>john.travis@nist.gov</contact>
          </supplier>
          <preparationDescription>out of crane</preparationDescription>
          <comment>regular drinking water</comment>
        </samplePreparation>
      </sampleDescription>
      <sampleProperty samplePropertyId="sp0">
        <sampleAttribute>
          <molecularWeight unit="AMU">18.02</molecularWeight>
          <meltingPoint unit="C">0</meltingPoint>
          <boilingPoint unit="C">100</boilingPoint>
          <density unit="g/cc">0.995</density>
          <refractiveIndex unit="rel. air, 20C, 434 nm">1.3404</refractiveIndex>
          <comment>usual properties</comment>
        </sampleAttribute>
        <sampleParameter>
          <state>liquid</state>
          <pathLength unit="mm">10</pathLength>
          <amount unit="ml">5</amount>
          <pressure unit="torr">760</pressure>
          <temperature unit="K">293</temperature>
          <comment>filled cuvette</comment>
        </sampleParameter>
      </sampleProperty>
    </sample>
    <measurement>
      <measurementDescription measurementDescriptionId="md0">
        <measurementDesignation>
          <identifier>M876-UV</identifier>
          <title>water analysis</title>
          <owner>
            <name>NIST, Analytical Chemistry Division</name>
            <contact>(301) 975-4645</contact>
          </owner>
          <laboratoryReference>printout 11/01/00 - 1</laboratoryReference>
          <comment>single quick measurement</comment>
        </measurementDesignation>
        <measurementExecution>
          <project>SpectroML</project>
          <timeStamp>
            <date>2001-11-01</date>
            <time>09:30:00</time>
          </timeStamp>
          <operator>
            <name>Alexander Ruehl</name>
            <contact>alexander.ruehl@nist.gov</contact>
          </operator>
          <comment>test</comment>
        </measurementExecution>
      </measurementDescription>
      <measurementProperty measurementPropertyId="mp0">
        <measurementParameter>
          <measurementType>sample</measurementType>
          <scanMode>discrete wavelengths</scanMode>
          <referenceSample sampleDescriptionLink="">empty cuvette</referenceSample>
          <filter>none</filter>
          <signalNoise>none</signalNoise>
          <scanNumbers>1</scanNumbers>
          <scanDuration unit="s">5</scanDuration>
          <comment>no averaging</comment>
        </measurementParameter>
        <measurementCorrection>
          <qualificationTimeStamp>
            <date>2000-07-19</date>
            <time>11:51:00</time>
          </qualificationTimeStamp>
          <qualificationReference>qual.csv</qualificationReference>
          <proficiencyTimeStamp>
            <date>2000-07-19</date>
            <time>14:00:00</time>
          </proficiencyTimeStamp>
          <proficiencyReference>prof.csv</proficiencyReference>
          <transmittanceTimeStamp>
            <date>2000-09-29</date>
            <time>10:05:00</time>
          </transmittanceTimeStamp>
          <transmittanceReference>trans.csv</transmittanceReference>
          <wavelengthTimeStamp>
            <date>2000-09-29</date>
            <time>15:12:00</time>
          </wavelengthTimeStamp>
          <wavelengthReference>wave.csv</wavelengthReference>
          <comment>NTRM correction infos</comment>
        </measurementCorrection>
      </measurementProperty>
    </measurement>
    <data>
      <dataProperty dataPropertyId="dp0">
        <dataParameter>
          <axisLabel>
            <axis dim="x">Wavelength</axis>
            <axis dim="y">Transmittance</axis>
          </axisLabel>
          <axisUnit>
            <axis dim="x">nm</axis>
            <axis dim="y">%T</axis>
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</axisUnit  > 

<minimumValue> 

<value  dim="x">270</value> 

<value  dim="y">0 . 029019</value> 

</minimumValue> 

<maximumValue> 

<value  dim="x" >976</value> 

<value  dim= "y" >0 . 10616</value> 

</maximumValue> 

<comment>lines  at  270  nm,  655nm,  976  nm</comment> 

</dataParameter> 

<dataCalculation> 

<scaleFactor> 

cvalue  dim="x" >l</value> 

<value  dim="y" >l</value> 

</scaleFactor> 

<number Points >3 < /number Point s> 

<point Increment  > 

<value  dim="x">0</value> 

<value  dim="y">0</value> 

< / point Increment  > 

<startValue> 

<value  dim="x" >270</value> 
cvalue  dim="y" >0 . 029019</value> 

</ startValue> 

< comment >discrete  point s</ comments 
</dataCalcuiation> 

</dataProperty> 

cdataCore  dataCoreId="dcO"> 

cvalues  dim  = "x">270  655  976</values> 

cvalues  dim  = "y" >0.10616  0.029019  0 .23453</values> 

</dataCore> 

< / data> 

</experiment> 

</SpectroML> 
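
As an illustration only (it is not part of SpectroML), the following Python sketch reads the values lists of a dataCore block like the one at the end of the example and pairs them into (x, y) points. The inline fragment is a trimmed copy of that block, and the function name read_data_core is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# Trimmed fragment based on the dataCore block of the example listing;
# a real file would carry the full experiment around it.
FRAGMENT = """
<data>
  <dataCore dataCoreId="dc0">
    <values dim="x">270 655 976</values>
    <values dim="y">0.10616 0.029019 0.23453</values>
  </dataCore>
</data>
"""


def read_data_core(data_element):
    """Return {dim: [float, ...]} for each values element of the first dataCore."""
    core = data_element.find("dataCore")
    return {
        values.get("dim"): [float(token) for token in values.text.split()]
        for values in core.findall("values")
    }


axes = read_data_core(ET.fromstring(FRAGMENT))
# Pair the two axes point by point.
points = list(zip(axes["x"], axes["y"]))
print(points)
```

Splitting on whitespace mirrors the schema's valueList type, which is declared as a list of doubles.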
Document type definition

<?xml version="1.0" encoding="iso-8859-1"?>
<!-- DTD for SpectroML, 6/5/01 -->

<!ELEMENT SpectroML (experiment*)>
<!ATTLIST SpectroML
    version CDATA #FIXED "1.0">

<!ELEMENT experiment (file, instrument, sample, measurement, data)>
<!ATTLIST experiment
    type CDATA "UV/Vis"
    language CDATA #IMPLIED
    experimentId ID #IMPLIED>

<!ELEMENT file (title, timeStamp, path*, comment?)>
<!ATTLIST file
    experimentLinks IDREFS #IMPLIED
    externalLinks CDATA #IMPLIED>

<!ELEMENT data (dataProperty+, dataCore+)>
<!ELEMENT instrument (instrumentDescription+, instrumentProperty*)>
<!ELEMENT measurement (measurementDescription*, measurementProperty*)>
<!ELEMENT sample (sampleDescription*, sampleProperty*)>

<!ELEMENT dataCore (values*)>
<!ATTLIST dataCore
    dataCoreId ID #IMPLIED>

<!ELEMENT dataProperty (dataParameter?, dataCalculation?)>
<!ATTLIST dataProperty
    dataPropertyId ID #IMPLIED>

<!ELEMENT instrumentDescription (instrumentDesignation?, instrumentApplication?)>
<!ATTLIST instrumentDescription
    instrumentDescriptionId ID #IMPLIED>

<!ELEMENT instrumentProperty (instrumentSetting?, instrumentParameter?)>
<!ATTLIST instrumentProperty
    instrumentPropertyId ID #IMPLIED>

<!ELEMENT measurementDescription (measurementDesignation?, measurementExecution?)>
<!ATTLIST measurementDescription
    measurementDescriptionId ID #IMPLIED>

<!ELEMENT measurementProperty (measurementParameter?, measurementCorrection?)>
<!ATTLIST measurementProperty
    measurementPropertyId ID #IMPLIED>

<!ELEMENT sampleDescription (sampleDesignation?, samplePreparation?)>
<!ATTLIST sampleDescription
    sampleDescriptionId ID #IMPLIED>

<!ELEMENT sampleProperty (sampleAttribute?, sampleParameter?)>
<!ATTLIST sampleProperty
    samplePropertyId ID #IMPLIED>

<!ELEMENT dataCalculation (scaleFactor?, numberPoints?, pointIncrement?, startValue?, comment?)>
<!ELEMENT dataParameter (axisLabel?, axisUnit?, minimumValue?, maximumValue?, comment?)>
<!ELEMENT instrumentApplication (software?, version?, operatingSystem?, firmware?, operator?, comment?)>
<!ELEMENT instrumentDesignation (identifier?, manufacturer?, model?, owner?, location?, comment?)>
<!ELEMENT instrumentParameter (slitWidth?, spectralSlitWidth?, beamChannel?, sampleHolder?,
    samplePosition?, scanSpeed?, pointSeparation?, comment?)>
<!ELEMENT instrumentSetting (resolution?, linearDispersion?, spectralBandWidthRange?, wavelengthRange?,
    absorbanceRange?, detectorTypes?, sourceTypes?, comment?)>
<!ELEMENT measurementCorrection (qualificationTimeStamp?, qualificationReference?,
    proficiencyTimeStamp?, proficiencyReference?, transmittanceTimeStamp?, transmittanceReference?,
    wavelengthTimeStamp?, wavelengthReference?, comment?)>
<!ELEMENT measurementDesignation (identifier?, title?, owner?, laboratoryReference?, comment?)>
<!ELEMENT measurementExecution (project?, timeStamp?, operator?, comment?)>
<!ELEMENT measurementParameter (measurementType?, scanMode?, referenceSample?, filter?, signalNoise?,
    scanNumbers?, scanDuration?, comment?)>
<!ELEMENT sampleAttribute (molecularWeight?, meltingPoint?, boilingPoint?, density?, refractiveIndex?,
    comment?)>
<!ELEMENT sampleDesignation (identifier?, name?, owner?, location?, casNumber?, formula?,
    storageMethod?, disposalMethod?, comment?)>
<!ELEMENT sampleParameter (state?, pathLength?, amount?, pressure?, temperature?, humidity?, comment?)>
<!ELEMENT samplePreparation (procedureMethod?, timeStamp?, operator?, supplier?,
    preparationDescription?, comment?)>

<!ELEMENT path EMPTY>
<!ATTLIST path
    pathId ID #REQUIRED
    instrumentDescriptionLink IDREF #REQUIRED
    instrumentPropertyLink IDREF #REQUIRED
    sampleDescriptionLink IDREF #REQUIRED
    samplePropertyLink IDREF #REQUIRED
    measurementDescriptionLink IDREF #REQUIRED
    measurementPropertyLink IDREF #REQUIRED
    dataPropertyLink IDREF #REQUIRED
    dataCoreLink IDREF #REQUIRED>

<!ELEMENT absorbanceRange (minimum, maximum)>
<!ELEMENT axisLabel (axis*)>
<!ELEMENT axisUnit (axis*)>
<!ELEMENT location (name, contact)>
<!ELEMENT maximumValue (value*)>
<!ELEMENT minimumValue (value*)>
<!ELEMENT operator (name, contact)>
<!ELEMENT owner (name, contact)>
<!ELEMENT pointIncrement (value*)>
<!ELEMENT proficiencyTimeStamp (date, time)>
<!ELEMENT qualificationTimeStamp (date, time)>
<!ELEMENT scaleFactor (value*)>
<!ELEMENT spectralBandWidthRange (minimum, maximum)>
<!ELEMENT supplier (name, contact)>
<!ELEMENT startValue (value*)>
<!ELEMENT timeStamp (date, time)>
<!ELEMENT transmittanceTimeStamp (date, time)>
<!ELEMENT wavelengthRange (minimum, maximum)>
<!ELEMENT wavelengthTimeStamp (date, time)>


<!ELEMENT amount (#PCDATA)>
<!ATTLIST amount
    unit CDATA #REQUIRED>
<!ELEMENT axis (#PCDATA)>
<!ATTLIST axis
    dim CDATA #REQUIRED>
<!ELEMENT beamChannel (#PCDATA)>
<!ELEMENT boilingPoint (#PCDATA)>
<!ATTLIST boilingPoint
    unit CDATA #REQUIRED>
<!ELEMENT casNumber (#PCDATA)>
<!ELEMENT comment (#PCDATA)>
<!ELEMENT contact (#PCDATA)>
<!ELEMENT date (#PCDATA)>
<!ELEMENT density (#PCDATA)>
<!ATTLIST density
    unit CDATA #REQUIRED>
<!ELEMENT detectorTypes (#PCDATA)>
<!ELEMENT disposalMethod (#PCDATA)>
<!ELEMENT filter (#PCDATA)>
<!ELEMENT firmware (#PCDATA)>
<!ELEMENT formula (#PCDATA)>
<!ELEMENT humidity (#PCDATA)>
<!ATTLIST humidity
    unit CDATA #REQUIRED>
<!ELEMENT identifier (#PCDATA)>
<!ELEMENT laboratoryReference (#PCDATA)>
<!ELEMENT linearDispersion (#PCDATA)>
<!ATTLIST linearDispersion
    unit CDATA #REQUIRED>
<!ELEMENT manufacturer (#PCDATA)>
<!ELEMENT maximum (#PCDATA)>
<!ATTLIST maximum
    unit CDATA #REQUIRED>
<!ELEMENT measurementType (#PCDATA)>
<!ELEMENT meltingPoint (#PCDATA)>
<!ATTLIST meltingPoint
    unit CDATA #REQUIRED>
<!ELEMENT minimum (#PCDATA)>
<!ATTLIST minimum
    unit CDATA #REQUIRED>
<!ELEMENT model (#PCDATA)>
<!ELEMENT molecularWeight (#PCDATA)>
<!ATTLIST molecularWeight
    unit CDATA #REQUIRED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT numberPoints (#PCDATA)>
<!ELEMENT operatingSystem (#PCDATA)>
<!ELEMENT pathLength (#PCDATA)>
<!ATTLIST pathLength
    unit CDATA #REQUIRED>
<!ELEMENT pointSeparation (#PCDATA)>
<!ATTLIST pointSeparation
    unit CDATA #REQUIRED>
<!ELEMENT preparationDescription (#PCDATA)>
<!ELEMENT pressure (#PCDATA)>
<!ATTLIST pressure
    unit CDATA #REQUIRED>
<!ELEMENT procedureMethod (#PCDATA)>
<!ELEMENT proficiencyReference (#PCDATA)>
<!ELEMENT project (#PCDATA)>
<!ELEMENT qualificationReference (#PCDATA)>
<!ELEMENT referenceSample (#PCDATA)>
<!ATTLIST referenceSample
    sampleDescriptionLink CDATA #IMPLIED>
<!ELEMENT refractiveIndex (#PCDATA)>
<!ATTLIST refractiveIndex
    unit CDATA #REQUIRED>
<!ELEMENT resolution (#PCDATA)>
<!ATTLIST resolution
    unit CDATA #REQUIRED>
<!ELEMENT sampleHolder (#PCDATA)>
<!ELEMENT samplePosition (#PCDATA)>
<!ELEMENT scanDuration (#PCDATA)>
<!ATTLIST scanDuration
    unit CDATA #REQUIRED>
<!ELEMENT scanMode (#PCDATA)>
<!ELEMENT scanNumbers (#PCDATA)>
<!ELEMENT scanSpeed (#PCDATA)>
<!ATTLIST scanSpeed
    unit CDATA #REQUIRED>
<!ELEMENT signalNoise (#PCDATA)>
<!ELEMENT slitWidth (#PCDATA)>
<!ATTLIST slitWidth
    unit CDATA #REQUIRED>
<!ELEMENT software (#PCDATA)>
<!ELEMENT sourceTypes (#PCDATA)>
<!ELEMENT spectralSlitWidth (#PCDATA)>
<!ATTLIST spectralSlitWidth
    unit CDATA #REQUIRED>
<!ELEMENT state (#PCDATA)>
<!ELEMENT storageMethod (#PCDATA)>
<!ELEMENT temperature (#PCDATA)>
<!ATTLIST temperature
    unit CDATA #REQUIRED>
<!ELEMENT time (#PCDATA)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT transmittanceReference (#PCDATA)>
<!ELEMENT value (#PCDATA)>
<!ATTLIST value
    dim CDATA #REQUIRED>
<!ELEMENT values (#PCDATA)>
<!ATTLIST values
    dim CDATA #REQUIRED>
<!ELEMENT version (#PCDATA)>
<!ELEMENT wavelengthReference (#PCDATA)>
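
The path element of the DTD ties one block from each group together through IDREF attributes. The Python sketch below is illustrative and not part of SpectroML: it resolves those links by matching each attribute whose name ends in Link to the element that carries the same value in an attribute whose name ends in Id. The instance is hypothetical and trimmed to two of the eight required links, so it would not validate against the DTD, and resolve_path is a name invented for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed instance: only enough structure to show linking.
DOC = """
<experiment experimentId="e0">
  <file>
    <path pathId="p0" samplePropertyLink="sp0" dataCoreLink="dc0"/>
  </file>
  <sample>
    <sampleProperty samplePropertyId="sp0"/>
  </sample>
  <data>
    <dataCore dataCoreId="dc0">
      <values dim="x">270 655 976</values>
    </dataCore>
  </data>
</experiment>
"""


def resolve_path(root, path):
    """Map each *Link attribute of a path element to the element it points at."""
    # Index every element by the value of its *Id attribute.
    by_id = {}
    for element in root.iter():
        for name, value in element.attrib.items():
            if name.endswith("Id"):
                by_id[value] = element
    # Follow each link; a dangling reference raises KeyError.
    return {
        name: by_id[value]
        for name, value in path.attrib.items()
        if name.endswith("Link")
    }


root = ET.fromstring(DOC)
links = resolve_path(root, root.find("file/path"))
print({name: element.tag for name, element in links.items()})
```

A validating parser would check the ID and IDREF constraints itself; the sketch only shows how an application might follow the links once the document has been read.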


Version  1.01  [April  2002] 




SpectroML 


Appendix  C:  SpectroML  Code 


Schema 


<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Schema for SpectroML, 6/5/01 -->
<schema xmlns="http://www.w3.org/2001/XMLSchema">

<element name="SpectroML">
  <complexType>
    <sequence>
      <element ref="experiment" maxOccurs="unbounded"/>
    </sequence>
    <attribute name="version" fixed="1.0" type="string"/>
  </complexType>
</element>

<element name="experiment">
  <complexType>
    <sequence>
      <element ref="file"/>
      <element ref="instrument"/>
      <element ref="sample"/>
      <element ref="measurement"/>
      <element ref="data"/>
    </sequence>
    <attribute name="type" type="string"/>
    <attribute name="language" type="language"/>
    <attribute name="experimentId" type="ID"/>
  </complexType>
</element>

<element name="file">
  <complexType>
    <sequence>
      <element ref="title"/>
      <element ref="timeStamp"/>
      <element ref="path" minOccurs="0" maxOccurs="unbounded"/>
      <element ref="comment" minOccurs="0"/>
    </sequence>
    <attribute name="experimentLinks" type="IDREFS"/>
    <attribute name="externalLinks" type="string"/>
  </complexType>
</element>

<element name="data">
  <complexType>
    <sequence>
      <element ref="dataProperty" maxOccurs="unbounded"/>
      <element ref="dataCore" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="instrument">
  <complexType>
    <sequence>
      <element ref="instrumentDescription" maxOccurs="unbounded"/>
      <element ref="instrumentProperty" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="measurement">
  <complexType>
    <sequence>
      <element ref="measurementDescription" maxOccurs="unbounded"/>
      <element ref="measurementProperty" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="sample">
  <complexType>
    <sequence>
      <element ref="sampleDescription" maxOccurs="unbounded"/>
      <element ref="sampleProperty" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="dataCore">
  <complexType>
    <sequence>
      <element ref="values" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <attribute name="dataCoreId" type="ID"/>
  </complexType>
</element>

<element name="dataProperty">
  <complexType>
    <all>
      <element ref="dataParameter" minOccurs="0"/>
      <element ref="dataCalculation" minOccurs="0"/>
    </all>
    <attribute name="dataPropertyId" type="ID"/>
  </complexType>
</element>

<element name="instrumentDescription">
  <complexType>
    <all>
      <element ref="instrumentDesignation" minOccurs="0"/>
      <element ref="instrumentApplication" minOccurs="0"/>
    </all>
    <attribute name="instrumentDescriptionId" type="ID"/>
  </complexType>
</element>

<element name="instrumentProperty">
  <complexType>
    <all>
      <element ref="instrumentSetting" minOccurs="0"/>
      <element ref="instrumentParameter" minOccurs="0"/>
    </all>
    <attribute name="instrumentPropertyId" type="ID"/>
  </complexType>
</element>

<element name="measurementDescription">
  <complexType>
    <all>
      <element ref="measurementDesignation" minOccurs="0"/>
      <element ref="measurementExecution" minOccurs="0"/>
    </all>
    <attribute name="measurementDescriptionId" type="ID"/>
  </complexType>
</element>

<element name="measurementProperty">
  <complexType>
    <all>
      <element ref="measurementParameter" minOccurs="0"/>
      <element ref="measurementCorrection" minOccurs="0"/>
    </all>
    <attribute name="measurementPropertyId" type="ID"/>
  </complexType>
</element>

<element name="sampleDescription">
  <complexType>
    <all>
      <element ref="sampleDesignation" minOccurs="0"/>
      <element ref="samplePreparation" minOccurs="0"/>
    </all>
    <attribute name="sampleDescriptionId" type="ID"/>
  </complexType>
</element>

<element name="sampleProperty">
  <complexType>
    <all>
      <element ref="sampleAttribute" minOccurs="0"/>
      <element ref="sampleParameter" minOccurs="0"/>
    </all>
    <attribute name="samplePropertyId" type="ID"/>
  </complexType>
</element>


<element name="dataCalculation">
  <complexType>
    <all>
      <element ref="scaleFactor" minOccurs="0"/>
      <element ref="numberPoints" minOccurs="0"/>
      <element ref="pointIncrement" minOccurs="0"/>
      <element ref="startValue" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="dataParameter">
  <complexType>
    <all>
      <element ref="axisLabel" minOccurs="0"/>
      <element ref="axisUnit" minOccurs="0"/>
      <element ref="minimumValue" minOccurs="0"/>
      <element ref="maximumValue" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="instrumentApplication">
  <complexType>
    <all>
      <element ref="software" minOccurs="0"/>
      <element ref="version" minOccurs="0"/>
      <element ref="operatingSystem" minOccurs="0"/>
      <element ref="firmware" minOccurs="0"/>
      <element ref="operator" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="instrumentDesignation">
  <complexType>
    <all>
      <element ref="identifier" minOccurs="0"/>
      <element ref="manufacturer" minOccurs="0"/>
      <element ref="model" minOccurs="0"/>
      <element ref="owner" minOccurs="0"/>
      <element ref="location" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="instrumentParameter">
  <complexType>
    <all>
      <element ref="slitWidth" minOccurs="0"/>
      <element ref="spectralSlitWidth" minOccurs="0"/>
      <element ref="beamChannel" minOccurs="0"/>
      <element ref="sampleHolder" minOccurs="0"/>
      <element ref="samplePosition" minOccurs="0"/>
      <element ref="scanSpeed" minOccurs="0"/>
      <element ref="pointSeparation" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>
<element name="instrumentSetting">
  <complexType>
    <all>
      <element ref="resolution" minOccurs="0"/>
      <element ref="linearDispersion" minOccurs="0"/>
      <element ref="spectralBandWidthRange" minOccurs="0"/>
      <element ref="wavelengthRange" minOccurs="0"/>
      <element ref="absorbanceRange" minOccurs="0"/>
      <element ref="detectorTypes" minOccurs="0"/>
      <element ref="sourceTypes" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="measurementCorrection">
  <complexType>
    <all>
      <element ref="qualificationTimeStamp" minOccurs="0"/>
      <element ref="qualificationReference" minOccurs="0"/>
      <element ref="proficiencyTimeStamp" minOccurs="0"/>
      <element ref="proficiencyReference" minOccurs="0"/>
      <element ref="transmittanceTimeStamp" minOccurs="0"/>
      <element ref="transmittanceReference" minOccurs="0"/>
      <element ref="wavelengthTimeStamp" minOccurs="0"/>
      <element ref="wavelengthReference" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="measurementDesignation">
  <complexType>
    <all>
      <element ref="identifier" minOccurs="0"/>
      <element ref="title" minOccurs="0"/>
      <element ref="owner" minOccurs="0"/>
      <element ref="laboratoryReference" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="measurementExecution">
  <complexType>
    <all>
      <element ref="project" minOccurs="0"/>
      <element ref="timeStamp" minOccurs="0"/>
      <element ref="operator" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="measurementParameter">
  <complexType>
    <all>
      <element ref="measurementType" minOccurs="0"/>
      <element ref="scanMode" minOccurs="0"/>
      <element ref="referenceSample" minOccurs="0"/>
      <element ref="filter" minOccurs="0"/>
      <element ref="signalNoise" minOccurs="0"/>
      <element ref="scanNumbers" minOccurs="0"/>
      <element ref="scanDuration" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="sampleAttribute">
  <complexType>
    <all>
      <element ref="molecularWeight" minOccurs="0"/>
      <element ref="meltingPoint" minOccurs="0"/>
      <element ref="boilingPoint" minOccurs="0"/>
      <element ref="density" minOccurs="0"/>
      <element ref="refractiveIndex" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="sampleDesignation">
  <complexType>
    <all>
      <element ref="identifier" minOccurs="0"/>
      <element ref="name" minOccurs="0"/>
      <element ref="owner" minOccurs="0"/>
      <element ref="location" minOccurs="0"/>
      <element ref="casNumber" minOccurs="0"/>
      <element ref="formula" minOccurs="0"/>
      <element ref="storageMethod" minOccurs="0"/>
      <element ref="disposalMethod" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="sampleParameter">
  <complexType>
    <all>
      <element ref="state" minOccurs="0"/>
      <element ref="pathLength" minOccurs="0"/>
      <element ref="amount" minOccurs="0"/>
      <element ref="pressure" minOccurs="0"/>
      <element ref="temperature" minOccurs="0"/>
      <element ref="humidity" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>

<element name="samplePreparation">
  <complexType>
    <all>
      <element ref="procedureMethod" minOccurs="0"/>
      <element ref="timeStamp" minOccurs="0"/>
      <element ref="operator" minOccurs="0"/>
      <element ref="supplier" minOccurs="0"/>
      <element ref="preparationDescription" minOccurs="0"/>
      <element ref="comment" minOccurs="0"/>
    </all>
  </complexType>
</element>
<element name="absorbanceRange">
  <complexType>
    <all>
      <element ref="minimum"/>
      <element ref="maximum"/>
    </all>
  </complexType>
</element>

<element name="axisLabel">
  <complexType>
    <sequence>
      <element ref="axis" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="axisUnit">
  <complexType>
    <sequence>
      <element ref="axis" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="location">
  <complexType>
    <all>
      <element ref="name"/>
      <element ref="contact"/>
    </all>
  </complexType>
</element>

<element name="maximumValue">
  <complexType>
    <sequence>
      <element ref="value" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="minimumValue">
  <complexType>
    <sequence>
      <element ref="value" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="operator">
  <complexType>
    <all>
      <element ref="name"/>
      <element ref="contact"/>
    </all>
  </complexType>
</element>

<element name="owner">
  <complexType>
    <all>
      <element ref="name"/>
      <element ref="contact"/>
    </all>
  </complexType>
</element>

<element name="path">
  <complexType>
    <attribute name="pathId" use="required" type="ID"/>
    <attribute name="instrumentDescriptionLink" use="required" type="IDREF"/>
    <attribute name="instrumentPropertyLink" use="required" type="IDREF"/>
    <attribute name="sampleDescriptionLink" use="required" type="IDREF"/>
    <attribute name="samplePropertyLink" use="required" type="IDREF"/>
    <attribute name="measurementDescriptionLink" use="required" type="IDREF"/>
    <attribute name="measurementPropertyLink" use="required" type="IDREF"/>
    <attribute name="dataPropertyLink" use="required" type="IDREF"/>
    <attribute name="dataCoreLink" use="required" type="IDREF"/>
  </complexType>
</element>

<element name="pointIncrement">
  <complexType>
    <sequence>
      <element ref="value" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="proficiencyTimeStamp">
  <complexType>
    <all>
      <element ref="date"/>
      <element ref="time"/>
    </all>
  </complexType>
</element>

<element name="qualificationTimeStamp">
  <complexType>
    <all>
      <element ref="date"/>
      <element ref="time"/>
    </all>
  </complexType>
</element>

<element name="scaleFactor">
  <complexType>
    <sequence>
      <element ref="value" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="spectralBandWidthRange">
  <complexType>
    <all>
      <element ref="minimum"/>
      <element ref="maximum"/>
    </all>
  </complexType>
</element>

<element name="startValue">
  <complexType>
    <sequence>
      <element ref="value" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

<element name="supplier">
  <complexType>
    <all>
      <element ref="name"/>
      <element ref="contact"/>
    </all>
  </complexType>
</element>

<element name="timeStamp">
  <complexType>
    <all>
      <element ref="date"/>
      <element ref="time"/>
    </all>
  </complexType>
</element>

<element name="transmittanceTimeStamp">
  <complexType>
    <all>
      <element ref="date"/>
      <element ref="time"/>
    </all>
  </complexType>
</element>

<element name="wavelengthRange">
  <complexType>
    <all>
      <element ref="minimum"/>
      <element ref="maximum"/>
    </all>
  </complexType>
</element>

<element name="wavelengthTimeStamp">
  <complexType>
    <all>
      <element ref="date"/>
      <element ref="time"/>
    </all>
  </complexType>
</element>

<element name="amount">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="axis">
  <complexType>
    <simpleContent>
      <extension base="string">
        <attribute name="dim" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="beamChannel" type="string"/>

<element name="boilingPoint">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="casNumber" type="string"/>
<element name="comment" type="string"/>
<element name="contact" type="string"/>
<element name="date" type="date"/>

<element name="density">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="detectorTypes" type="string"/>
<element name="disposalMethod" type="string"/>
<element name="filter" type="string"/>
<element name="firmware" type="string"/>
<element name="formula" type="string"/>

<element name="humidity">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="identifier" type="string"/>
<element name="laboratoryReference" type="string"/>

<element name="linearDispersion">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="manufacturer" type="string"/>

<element name="maximum">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="measurementType" type="string"/>

<element name="meltingPoint">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="minimum">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="model" type="string"/>

<element name="molecularWeight">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="name" type="string"/>
<element name="numberPoints" type="unsignedInt"/>
<element name="operatingSystem" type="string"/>

<element name="pathLength">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="pointSeparation">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="preparationDescription" type="string"/>

<element name="pressure">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="procedureMethod" type="string"/>
<element name="proficiencyReference" type="string"/>
<element name="project" type="string"/>
<element name="qualificationReference" type="string"/>

<element name="referenceSample">
  <complexType>
    <simpleContent>
      <extension base="string">
        <attribute name="sampleDescriptionLink" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="refractiveIndex">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="resolution">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="sampleHolder" type="string"/>
<element name="samplePosition" type="string"/>

<element name="scanDuration">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="scanMode" type="string"/>
<element name="scanNumbers" type="unsignedInt"/>

<element name="scanSpeed">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="signalNoise" type="string"/>

<element name="slitWidth">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="software" type="string"/>
<element name="sourceTypes" type="string"/>

<element name="spectralSlitWidth">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="state" type="string"/>
<element name="storageMethod" type="string"/>

<element name="temperature">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="unit" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="time" type="time"/>
<element name="title" type="string"/>
<element name="transmittanceReference" type="string"/>

<element name="value">
  <complexType>
    <simpleContent>
      <extension base="double">
        <attribute name="dim" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="values">
  <complexType>
    <simpleContent>
      <extension base="valueList">
        <attribute name="dim" use="required" type="string"/>
      </extension>
    </simpleContent>
  </complexType>
</element>

<element name="version" type="string"/>
<element name="wavelengthReference" type="string"/>

<simpleType name='valueList'>
  <list itemType='double'/>
</simpleType>

</schema>
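
Many leaf elements in the schema share one pattern: numeric content typed as double plus a required unit attribute. As a hedged illustration of how an application might consume that pattern, the Python sketch below reads such elements into (value, unit) pairs. The Quantity type, the read_quantity function, and the sample values are assumptions made for this sketch and do not come from SpectroML.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Quantity(NamedTuple):
    """A numeric value together with the unit string from the unit attribute."""
    value: float
    unit: str


def read_quantity(element):
    """Read an element whose content is a double and whose unit attribute is required."""
    unit = element.get("unit")
    if unit is None:
        raise ValueError(f"<{element.tag}> is missing its required unit attribute")
    return Quantity(float(element.text), unit)


# Hypothetical values, chosen only for illustration.
sample = ET.fromstring(
    "<sampleParameter>"
    '<pathLength unit="mm">10</pathLength>'
    '<temperature unit="C">25.0</temperature>'
    "</sampleParameter>"
)
quantities = {child.tag: read_quantity(child) for child in sample}
print(quantities["pathLength"])
```

Keeping the unit next to the number, rather than assuming a default, matches the schema's choice to make the attribute required.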
Stylesheet

<?xml version="1.0" encoding="iso-8859-1"?>
<!-- SpectroML stylesheet file, 6/5/01 -->
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="/">
<html><xsl:apply-templates/></html>
</xsl:template>

<xsl:template match="SpectroML">
<head><title>SpectroML</title></head>
<body><font size="6"><b>SpectroML <sup><xsl:value-of select="@version"/></sup></b></font>
<xsl:apply-templates/></body>
</xsl:template>

<xsl:template match="experiment">
<hr/><table><tr><td>
<font size="5">experiment <b>[<xsl:value-of select="@experimentId"/>]</b>
<i> (<xsl:value-of select="@type"/>, <xsl:value-of select="@language"/>)</i></font>
<xsl:apply-templates/>
</td></tr></table>
</xsl:template>

<xsl:template match="file|instrument|sample|measurement|data">
<tr><td><br/><font size="4"><b><xsl:value-of select="name()"/> group</b></font></td></tr>
<xsl:apply-templates/>
</xsl:template>

<xsl:template match="instrumentDescription|instrumentProperty|sampleDescription|sampleProperty|
measurementDescription|measurementProperty|dataProperty|dataCore">
<tr><td><font size="4"><xsl:value-of select="name()"/> block
<b> [<xsl:value-of select="@*"/>] </b></font></td></tr>
<xsl:apply-templates/>
</xsl:template>

<xsl:template match="instrumentDesignation|instrumentApplication|instrumentSetting|
instrumentParameter|sampleDesignation|samplePreparation|sampleAttribute|
sampleParameter|measurementDesignation|measurementExecution|measurementParameter|
measurementCorrection|dataParameter|dataCalculation">
<tr><td><font size="3"><b><i> <xsl:value-of select="name()"/> section</i></b></font></td></tr>
<xsl:apply-templates/>
<tr><td>&#160;</td></tr>
</xsl:template>

<xsl:template match="path">
<tr><td><b>path [<xsl:value-of select="@pathId"/>]: </b></td>
<td><b><xsl:value-of select="@instrumentDescriptionLink"/> - <xsl:value-of
select="@instrumentPropertyLink"/> - <xsl:value-of
select="@sampleDescriptionLink"/> - <xsl:value-of
select="@samplePropertyLink"/> - <xsl:value-of
select="@measurementDescriptionLink"/> - <xsl:value-of
select="@measurementPropertyLink"/> - <xsl:value-of
select="@dataPropertyLink"/> - <xsl:value-of select="@dataCoreLink"/></b></td></tr>
<xsl:apply-templates/>
</xsl:template>

<xsl:template match="axis|value|values">
<tr><td><b><xsl:value-of select="name()"/><i> (<xsl:value-of select="@dim"/>)</i>:</b></td><td>
<xsl:apply-templates/></td></tr>
</xsl:template>

<xsl:template match="*">
<tr><td><b><xsl:value-of select="name()"/>:</b></td><td>
<xsl:apply-templates/>
<xsl:if test="boolean(@*)"><i> (<xsl:value-of select="@*"/>)</i></xsl:if></td></tr>
</xsl:template>
</xsl:stylesheet>